High-throughput transcriptome and functional validation analysis

ABSTRACT

Methods for correlating genes and gene function are provided. Such methods generally involve selecting a candidate gene that appears to be correlated with a particular cellular state or activity and then validating the role of the candidate gene in establishment of such a cellular state or activity. Certain methods utilize RNA interference techniques in the validation process.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part application of U.S. patent application Ser. No. 10/027,807, filed Oct. 19, 2001, which is a continuation-in-part of U.S. patent application Ser. No. 09/627,362, filed Jul. 28, 2000, which claims the benefit of U.S. Provisional Application No. 60/146,640, filed Jul. 30, 1999, all of which are incorporated herein in their entirety for all purposes.

BACKGROUND

It is estimated that while over 100,000 genes are expressed by a mammalian genome, only a fraction are expressed in any particular cell or tissue. Gene expression patterns, especially as reflected in the abundance of mRNAs, vary according to cell or tissue type, with developmental or metabolic state, in response to insult or injury, and as a consequence of other genetic and environmental factors. Moreover, the pattern of expression changes in a dynamic fashion over time with changes in cell state and environment. The term “transcriptome” has been coined to describe the set of all genes expressed, at any given time, under defined conditions in a given tissue (Velculescu et al., 1997, Cell 88:243-51).

The detection of changes to the transcriptome can provide useful information regarding the identity of genes and gene products important in development, drug response, and, particularly, human disease processes. However, methods now used for identifying changes in the transcriptome suffer from a variety of deficiencies, e.g., they are expensive, require relatively large quantities of starting material, and/or do not efficiently identify low abundance transcripts important in mediating cell processes.

While a change in the expression of a particular gene between different cell states is evidence that the gene may be responsible for the difference in cell states, it would be preferable that the putative role assigned to the gene be validated. Such validation ideally would involve an assay system in which one can interrogate what effect, if any, modulation of expression of the gene has on a cellular state or cellular activity. If modulation of expression were found to be correlated with a change in cellular state or activity, this would substantiate the putative role for the gene. Thus, there remains a need for high-throughput methods for first identifying genes that appear to play a role in a particular cellular state or activity and then validating that the gene does in fact have such a role.

BRIEF SUMMARY OF THE INVENTION

One aspect of the present invention provides a method for identifying and producing an active double-stranded RNA (dsRNA) which attenuates a desired gene expression in a cell. In one particular embodiment, the method for identifying and producing an active dsRNA comprises: (a) producing a plurality of cDNAs, wherein each cDNA comprises at least a portion of a gene that is expressed in a cell; (b) producing a candidate dsRNA from at least one of the cDNAs; (c) introducing the candidate dsRNA into a reference cell having a gene expression similar to the cell in step (a); and (d) identifying an active dsRNA by determining whether the candidate dsRNA attenuates a desired gene expression in the reference cell.

Moreover, methods of the present invention can also include producing the identified active dsRNA from the corresponding cDNA of step (a). Since methods of the present invention provide a library, preferably a comprehensive library, of cDNA, once the active dsRNA has been identified it can be readily synthesized by transcription of the corresponding cDNA. Therefore, methods of the present invention do not require conventional chemical oligonucleotide synthesis and/or availability of known gene sequences to produce the active dsRNA.

Identification of the active dsRNA includes selecting a candidate gene and identifying whether the dsRNA of at least a portion of the candidate gene is an active dsRNA by determining whether modulation of expression of the candidate gene by dsRNA in a reference cell has a functional effect in the reference cell. The candidate gene is a gene that is expressed in a test cell and/or a control cell, and/or is expressed at a detectably different level with respect to the test cell and the control cell. The candidate gene can be an endogenous gene of the reference cell, or it can be present in the reference cell as an extrachromosomal gene. The test cell and control cell differ with respect to a particular cellular characteristic of interest. The active dsRNA alters a cellular activity or a cellular state in the reference cell by modulating the expression of the candidate gene.

Active dsRNA can be identified by a variety of methods, including by introducing the candidate dsRNA into the reference cell and detecting an alteration in a cellular activity or a cellular state in the reference cell. The alteration in a cellular activity or a cellular state in the reference cell indicates that the candidate gene plays a functional role in the reference cell and that the candidate dsRNA is an active dsRNA. Preferably, the candidate dsRNA is selected such that it is substantially identical to at least a part of the candidate gene.

In one embodiment, the cellular characteristic is cell health, the test cell is a diseased cell and the control cell is a healthy cell, and the candidate gene is potentially correlated with a disease.

In another embodiment, the cellular characteristic is stage of development and the test cell and the control cell are at different stages of development, and the candidate gene is potentially correlated with mediating the change between the different stages of development.

In yet another embodiment, the cellular characteristic is cellular differentiation and the candidate gene is potentially correlated with controlling cellular differentiation.

Preferably, the plurality of cDNA, which is used to synthesize dsRNA, is produced from at least one mRNA which is isolated from the cell. The isolated mRNA is then reverse transcribed by any of the methods conventionally known to one skilled in the art to produce the cDNA. Typically, the cDNA is then digested with one or more, preferably two, restriction enzymes to produce a plurality of similar length cDNAs. In this manner, a more comprehensive cDNA library is provided. In one particular embodiment of the present invention, the restriction enzyme is selected from the group consisting of Dpn1 and Rsa1. A plasmid or PCR fragment is then generated from the digested cDNAs by any of the conventional methods known to one skilled in the art. The candidate dsRNA is then produced by transcription of the plasmid or the PCR fragment.

In another embodiment, the cDNA is produced from all mRNAs that are isolated from the control cell. This provides a comprehensive cDNA library which comprises at least a portion of substantially all genes that are actively expressed in the cell.

Another aspect of the present invention provides a method for identifying and validating activity of an active dsRNA which attenuates a desired gene expression in a cell. The method generally comprises producing a candidate dsRNA, introducing the candidate dsRNA into a reference cell and identifying whether the candidate dsRNA is an active dsRNA by detecting an alteration in a cellular activity or a cellular state in the reference cell.

Yet another aspect of the present invention provides a high-throughput method for correlating genes and gene function, said method comprising: (a) producing a plurality of candidate dsRNAs from a plurality of cDNAs of a control cell such that each candidate dsRNA comprises at least a portion of a gene that is expressed in the control cell; (b) introducing each of the candidate dsRNAs into a plurality of separate reference cells, each having a gene expression similar to the control cell in step (a); and (c) identifying which candidate dsRNA is an active dsRNA by detecting an alteration in a cellular activity or a cellular state in the reference cell, a desired alteration indicating that the gene corresponding to the candidate dsRNA plays a functional role in the reference cell.

In one embodiment, the plurality of cDNAs is produced from a plurality of mRNAs as described herein. Preferably, each candidate dsRNA is substantially identical to at least a portion of the candidate gene.

Detecting an alteration in a cellular activity or a cellular state in the reference cell can involve a variety of methods. For example, one can detect modulation of ligand binding to a protein, detect a change in phenotype or determine whether the protein encoded by the candidate gene binds to another protein to form a complex that can be coimmunoprecipitated. Detecting a change in phenotype is particularly useful when the reference cell is a part of an organism. In addition, detecting an alteration in a cellular activity or a cellular state in the reference cell can involve determining whether interference with expression of the candidate gene in the reference cell is correlated with alteration of a cellular activity or cellular state. Interference can be achieved by introducing a double-stranded RNA into the reference cell that can specifically hybridize to the candidate gene.

The candidate gene can be selected from a normalized library prepared from cells of the same type as the test cell or the control cell. In one particular embodiment, the candidate gene is present in low abundance in the normalized library.

In another embodiment, the candidate gene is a differentially expressed gene selected from a subtracted library that is enriched for genes that are differentially expressed with respect to the test cell and the control cell. Preferably, the subtracted library is also normalized and the candidate gene is one of the genes that is both present in low abundance and differentially expressed in the subtracted and normalized library.

In one particular embodiment of the present invention, the candidate gene is selected by a method comprising: (i) preparing (A) a tester-normalized cDNA library which is a normalized library prepared from test cells; (B) a driver-normalized cDNA library which is a normalized library prepared from control cells; (C) a tester-subtracted cDNA library which is enriched in one or more genes that are up-regulated with respect to the test cell and the control cell; and (D) a driver-subtracted cDNA library which is enriched in one or more genes that are down-regulated with respect to the test cell and the control cell; and (ii) identifying one or more clones from the normalized libraries and/or the subtracted libraries, wherein the candidate gene is one of the clones identified.

In one embodiment, identification of one or more clones from the normalized libraries comprises: (A) contacting clones from the tester-normalized cDNA library with labeled probes derived from mRNA from test cells and contacting clones from the driver-normalized cDNA library with labeled probes derived from mRNA from control cells under conditions whereby probes specifically hybridize with complementary clones to form a first set of hybridization complexes; and (B) detecting at least one hybridization complex from the first set of hybridization complexes to identify a clone from one of the normalized libraries which is present in low abundance.

In another embodiment, identification of one or more clones from the subtracted libraries comprises: (A) contacting clones from the tester-subtracted cDNA library and contacting clones from the driver-subtracted cDNA library with a population of labeled probes under conditions whereby probes from the population of probes specifically hybridize with complementary clones to form a second set of hybridization complexes, and wherein the population of labeled probes is derived from mRNA from test cells and control cells; and (B) detecting at least one hybridization complex from the second set of hybridization complexes to identify a clone from one of the subtracted libraries which is differentially expressed above a threshold level with respect to the subtracted libraries.

Methods of the present invention can be used with a wide variety of cells and cell types. For example, in one embodiment the test cell is obtained from a mammal that has had a stroke or is at risk for stroke. In another embodiment, the test cell is obtained from a mammal that has a neurological disorder or develops a phenotype mimicking a human neurological disorder.

The reference cell can be part of a cell culture, a tissue, an organism or an embryo, and can be a neural cell, a glial cell or a neuroblastoma cell. The reference cell can be a mammalian cell. Preferably, the reference cell is a human cell or a model system which is useful for investigating a variety of human diseases and/or illnesses.

In one embodiment, the reference cell is useful as a model system for investigating neurological disorders in humans. In one particular embodiment, the reference cell has increased sensitivity to N-methyl-D-aspartate, β-amyloid, peroxide, oxygen-glucose deprivation, or combinations thereof. In such cases, the detecting step can comprise detecting a decrease in cellular sensitivity to N-methyl-D-aspartate, β-amyloid, peroxide, oxygen-glucose deprivation, or combinations thereof.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows duplicate arrays probed using the “knock-down” methods of the invention. Arrows show (A) presence of hybridization signal (triplicate spots) and (B) reduction of signal due to inclusion of knock-down polynucleotide during hybridization. This figure shows a portion (detail) of a larger array.

FIG. 2. Clones representing a group that are upregulated in Rsa I, 6 h (tester) as opposed to Rsa I, 0 h (driver) and are of low hybridization signal (=low abundance) in tester and driver are increased in their signal (abundance) under condition of Library ID “F” (normalized tester-subtracted) and PCR cycles=21, 23, 25, 27. Libraries (L) and numbers of amplification steps in the second PCR cycle (N) are indicated by the shorthand “LN.” For example, “A21” encodes a description of Library ID “A” with second PCR cycle process length of 21 cycles.
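
The “LN” shorthand used in the figure labels can be decoded mechanically. The following is a minimal sketch in Python, assuming that a Library ID is always a single capital letter followed directly by the cycle count; the names LibraryCondition and parse_ln are hypothetical and are introduced here only for illustration.

```python
import re
from typing import NamedTuple


class LibraryCondition(NamedTuple):
    """One figure label, e.g. "A21" (hypothetical helper type)."""
    library_id: str   # single-letter Library ID, e.g. "A"
    pcr_cycles: int   # number of cycles in the second PCR


def parse_ln(code: str) -> LibraryCondition:
    """Split an "LN" label into its Library ID and second-PCR cycle count.

    Assumption: one capital letter followed by one or more digits.
    """
    match = re.fullmatch(r"([A-Z])(\d+)", code.strip())
    if match is None:
        raise ValueError(f"not an LN label: {code!r}")
    return LibraryCondition(match.group(1), int(match.group(2)))
```

For example, parse_ln("A21") yields Library ID “A” with 21 cycles, matching the reading given in the figure description.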

FIG. 3. Clones representing a group that are upregulated in Rsa I, 6 h (tester) as opposed to Rsa I, 0 h (driver) and are of low hybridization signal (=low abundance) in tester and driver are increased in their signal (abundance) under condition of Library IDs “C” through “F” (normalized tester-subtracted), “H” through “K” (normalized driver-subtracted) and PCR cycles=25. Clones from Library IDs “A” and “B” are essentially unchanged.

FIG. 4. Clones representing groups that are upregulated in Rsa I, 6 h (tester) as opposed to Rsa I, 0 h (driver) and are of low, medium or high tester hybridization signal are normalized in their signal under condition of Library ID “B”.

FIG. 5. A Western Blot showing inhibition of expression of eGFP (enhanced Green Fluorescent Protein) by eGFP dsRNA in a neuroblastoma cell line (AGYNB-010) harboring a plasmid encoding for eGFP. The blot shows inhibition of eGFP expression for cells transfected with eGFP dsRNA (i.e., dsRNA corresponding to the entire eGFP coding region; lanes 9 and 10) and for cells transfected with eGFP dsRNA from the C-terminus (dsEGFP-C; lanes 6-8). Untransfected cells (mock cells; lanes 1-2) and cells transfected with UCP-2 dsRNA (dsUCP2; lanes 3-5) served as controls and show little or no inhibition of eGFP expression. Anti-MAP2 was used to assure equal loading.

FIG. 6A. A Western Blot showing inhibition of endogenous PARP by PARP dsRNA. Inhibition of endogenous PARP expression is observed for neuroblastoma cells (AGYNB-010) transfected with PARP dsRNA prepared from the C-terminus of PARP (dsPARP-C; lanes 3-6) or PARP dsRNA prepared from the N-terminus of PARP (dsPARP-N; lanes 7-10). Control cells transfected with UCP-2 dsRNA, in contrast, still express endogenous PARP (lanes 1-2). Anti-MAP2 was used to assure equal loading.

FIGS. 6B-6D. Results showing that RNAi mediated inhibition of PARP expression induces resistance to oxygen glucose deprivation (OGD). FIGS. 6B and 6C show views of neuroblastoma cells (AGYNB-010 cells) subjected to 6 hours of OGD. Cell viability was assayed by staining with a fluorescent dye that preferentially stains healthy cells rather than dead cells. Cells transfected with dsPARP 3 hours after initiation of OGD show significantly less cell death (FIG. 6C) as compared to control cells transfected with dsEGFP (FIG. 6B). FIG. 6D is a chart showing that AGYNB-010 cells transfected with dsPARP are rescued from cell death following 3 hours of OGD, whereas control cells that are either untransfected (mock cells) or transfected with dsEGFP show significant cell death after 6 hours of OGD.

FIGS. 7A-7C. Charts showing sensitivity of the AGYNB-010 neuroblastoma cell line to β-amyloid (FIG. 7A), N-methyl-D-aspartate (NMDA) (FIG. 7B) and oxygen glucose deprivation (OGD) (FIG. 7C).

FIGS. 8A and 8B are graphs depicting the expression of EGFP and UCP2 in the presence of dsRNA.

FIGS. 9A-9D show dsRNA-mediated inhibition of expression of caspase-3 (A), fas-activated kinase (FASTK, B), 14-3-3 (C) and 3-hydroxy-3-methylglutaryl-Coenzyme A synthase (D). The control level of each mRNA was determined in cells transfected with dsEGFP RNA and in mock transfected cells. Levels of GAPDH expression served as controls to ensure the quality of the mRNA and that equal amounts of cDNA were used in each reaction.

FIG. 10 is a graph depicting the effect of dsRNA in differentiated N2a cells. Real-time PCR was used to measure the levels of 14-3-3 mRNA from cells transfected with lipofectamine alone, dsRNA 14-3-3, and dsRNA EGFP. Data presented are the mean of two technical repeats. Similar results were obtained in two independent experiments.

DETAILED DESCRIPTION OF THE EMBODIMENTS

I. Definitions

As used in this specification and the appended claims, the singular forms “a,” “an” and “the” include plural references unless the context clearly dictates otherwise.

Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by a person skilled in the art to which this invention belongs. The following references provide one of skill with a general definition of many of the terms used in this invention: Singleton et al., DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY (2d ed. 1994); THE CAMBRIDGE DICTIONARY OF SCIENCE AND TECHNOLOGY (Walker ed., 1988); THE GLOSSARY OF GENETICS, 5TH ED., R. Rieger et al. (eds.), Springer Verlag (1991); and Hale & Marham, THE HARPER COLLINS DICTIONARY OF BIOLOGY (1991).

Various biochemical and molecular biology methods are well known in the art. For example, methods of isolation and purification of nucleic acids are described in detail in WO 97/10365; WO 97/27317; Chapter 3 of Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I. Theory and Nucleic Acid Preparation, (P. Tijssen, ed.) Elsevier, N.Y. (1993); Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, N.Y., (1989); and Current Protocols in Molecular Biology, (Ausubel, F. M. et al., eds.) John Wiley & Sons, Inc., New York (1987-1999), including supplements such as supplement 46 (April 1999).

As used herein, the following terms have the meanings ascribed to them unless specified otherwise:

The term “tissue,” as used herein in the context of a source of mRNA and cDNA, refers to any aggregation of morphologically or functionally related cells, or cell systems, and thus includes cells (including in vitro cultured cells), tissues, organs, and the like.

The term “library” as used herein, refers to a collection of polynucleotides (usually in the form of double-stranded cDNA) derived from mRNA of a particular tissue. The polynucleotides of a library may be, but are not necessarily, cloned into a vector.

The terms “nucleic acid,” “polynucleotide” and “oligonucleotide” are used interchangeably herein and refer to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompass known analogs of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally-occurring nucleotides. Examples of such analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, and peptide-nucleic acids (PNAs). A “subsequence” or “segment” refers to a sequence of nucleotides that comprises a part of a longer sequence of nucleotides.

A “gene,” for the purposes of the present disclosure, includes a DNA region encoding a gene product (see infra). The region can also include DNA regions that regulate the production of the gene product, whether or not such regulatory sequences are adjacent to coding and/or transcribed sequences. Accordingly, a gene can include, without limitation, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control regions.

“Gene expression” refers to the conversion of the information contained in a gene into a gene product. A gene product can be the direct transcriptional product of a gene (e.g., mRNA, tRNA, rRNA, antisense RNA, ribozyme, structural RNA or any other type of RNA) or a protein produced by translation of an mRNA. Gene products also include RNAs which are modified by processes such as capping, polyadenylation, methylation, and editing, and proteins modified by, for example, methylation, acetylation, phosphorylation, ubiquitination, ADP-ribosylation, myristoylation, and glycosylation.

“Modulation” refers to a change in the level or magnitude of an activity or process. The change can be either an increase or a decrease. For example, modulation of gene expression includes both gene activation and gene repression. Modulation can be assayed by determining any parameter that is indirectly or directly affected by the expression of the target gene. Such parameters include, e.g., changes in RNA or protein levels, changes in protein activity, changes in product levels, changes in downstream gene expression, changes in reporter gene transcription (luciferase, CAT, β-galactosidase, β-glucuronidase, green fluorescent protein (see, e.g., Misteli & Spector, Nature Biotechnology 15:961-964 (1997))); changes in signal transduction, phosphorylation and dephosphorylation, receptor-ligand interactions, second messenger concentrations (e.g., cGMP, cAMP, IP3, and Ca2+), and cell growth.

The term “complementary” means that one nucleic acid is identical to, or hybridizes selectively to, another nucleic acid molecule. Selectivity of hybridization exists when hybridization occurs that is more selective than total lack of specificity. Typically, selective hybridization will occur when there is at least about 55% identity over a stretch of at least 14-25 nucleotides, preferably at least 65%, more preferably at least 70%, at least about 75%, and most preferably at least 90%. Preferably, one nucleic acid hybridizes specifically to the other nucleic acid. See M. Kanehisa, Nucleic Acids Res. 12:203 (1984).

The term “exogenous” when used with reference to a molecule (e.g., a nucleic acid) refers to a molecule that is not normally present in a cell, but can be introduced into a cell by one or more genetic, biochemical or other methods. Normal presence in the cell is determined with respect to the particular developmental stage and environmental conditions of the cell. Thus, for example, a molecule that is present only during embryonic development of muscle is an exogenous molecule with respect to an adult muscle cell. An exogenous molecule can comprise, for example, a functioning version of a malfunctioning endogenous molecule or a malfunctioning version of a normally-functioning endogenous molecule.

An exogenous molecule can be, among other things, a small molecule, such as is generated by a combinatorial chemistry process, or a macromolecule such as a protein, nucleic acid, carbohydrate, lipid, glycoprotein, lipoprotein, polysaccharide, any modified derivative of the above molecules, or any complex comprising one or more of the above molecules. An exogenous molecule can be the same type of molecule as an endogenous molecule, e.g., protein or nucleic acid (i.e., an exogenous gene), providing it has a sequence that is different from an endogenous molecule. Methods for the introduction of exogenous molecules into cells are known to those of skill in the art and include, but are not limited to, lipid-mediated transfer (i.e., liposomes, including neutral and cationic lipids), electroporation, direct injection, cell fusion, particle bombardment, calcium phosphate co-precipitation, DEAE-dextran-mediated transfer and viral vector-mediated transfer.

By contrast, the term “endogenous” when used in reference to a molecule refers to a molecule that is normally present in a particular cell at a particular developmental stage under particular environmental conditions.

The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptides, refer to two or more sequences or subsequences that are the same or have a specified percentage of nucleotides or amino acid residues that are the same, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm such as those described below for example, or by visual inspection.

The phrase “substantially identical” in the context of two nucleic acids, refers to two or more sequences or subsequences that have at least 75%, preferably at least 80% or 85%, more preferably at least 90%, 95% or higher nucleotide identity, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm such as those described below for example, or by visual inspection. Preferably, the substantial identity exists over a region of the sequences that is at least about 40-60 nucleotides in length, in other instances over a region at least 60-80 nucleotides in length, in still other instances at least 90-100 nucleotides in length, and in yet other instances the sequences are substantially identical over the full length of the sequences being compared, such as the coding region of a nucleotide for example.
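
As an illustration of how the percent identity figures above might be checked, the following Python sketch computes identity over two sequences that have already been aligned (gaps written as "-"). It is offered under stated assumptions and is not the comparison procedure of any particular program: the function names are hypothetical, the denominator is taken to be the number of alignment columns, and the defaults of 75% identity and 40 nucleotides are simply the lowest values recited above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: the inputs are already aligned and of equal length,
    with "-" marking gaps; columns that are gaps in both are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def is_substantially_identical(
    aligned_a: str,
    aligned_b: str,
    min_identity: float = 75.0,  # lowest identity recited in the text
    min_length: int = 40,        # lowest region length recited in the text
) -> bool:
    """Apply an identity threshold over a region of at least min_length columns."""
    if len(aligned_a) < min_length:
        return False
    return percent_identity(aligned_a, aligned_b) >= min_identity
```

Other conventions for the denominator (for example, the length of the shorter sequence) are also in use, so a value computed this way need not match the figure reported by a given alignment program.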

For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.

Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection [see generally, Current Protocols in Molecular Biology, (Ausubel, F. M. et al., eds.) John Wiley & Sons, Inc., New York (1987-1999), including supplements such as supplement 46 (April 1999)]. Sequence comparisons with these programs are typically conducted using the default parameters specific for each program.
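
To make the notion of a local alignment concrete, the following Python sketch computes the best local alignment score of two sequences by dynamic programming in the manner of Smith & Waterman. It is a minimal illustration only: it uses a linear gap penalty, and the scoring values (match=2, mismatch=-1, gap=-2) are arbitrary assumptions rather than the defaults of any of the programs named above.

```python
def smith_waterman_score(
    a: str,
    b: str,
    match: int = 2,      # assumed reward for identical residues
    mismatch: int = -1,  # assumed penalty for differing residues
    gap: int = -2,       # assumed linear penalty per gap position
) -> int:
    """Return the highest-scoring local alignment score between a and b."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may start anywhere, so scores never drop below zero.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best
```

Only the score is returned; recovering the aligned residues themselves would require keeping the full matrix and tracing back from the best-scoring cell.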

Another example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. For identifying whether a nucleic acid or polypeptide is within the scope of the invention, the default parameters of the BLAST programs are suitable. The BLASTN program (for nucleotide sequences) uses as defaults a word length (W) of 11, an expectation (E) of 10, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix. The TBLASTN program (using protein sequence for nucleotide sequence) uses as defaults a word length (W) of 3, an expectation (E) of 10, and a BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1989)).
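
The extension step described above can be sketched as follows. This Python fragment extends a seed word hit in one direction only, without gaps, and stops under the three halting conditions recited: a fall-off of X from the best score, a cumulative score of zero or below, or the end of either sequence. It is a simplified illustration and not the BLAST implementation itself; the function name, the argument names and the default value of x_drop are assumptions, and the penalty is given as a negative number because N is defined as always less than zero.

```python
from typing import Tuple


def extend_right(
    query: str,
    subject: str,
    q_start: int,
    s_start: int,
    word_len: int,
    reward: int = 5,    # M: score for a matching pair (always > 0)
    penalty: int = -4,  # N: score for a mismatching pair (always < 0)
    x_drop: int = 20,   # X: allowed fall-off from the best score (assumed value)
) -> Tuple[int, int]:
    """Extend a word hit to the right; return (best score, length at best score)."""
    # Score the seed word itself.
    score = sum(
        reward if query[q_start + k] == subject[s_start + k] else penalty
        for k in range(word_len)
    )
    best_score, best_len = score, word_len
    q, s = q_start + word_len, s_start + word_len
    # Halt at the end of either sequence.
    while q < len(query) and s < len(subject):
        score += reward if query[q] == subject[s] else penalty
        q += 1
        s += 1
        if score > best_score:
            best_score, best_len = score, q - q_start
        elif best_score - score >= x_drop or score <= 0:
            # Halt on a fall-off of X from the maximum, or a non-positive score.
            break
    return best_score, best_len
```

A full implementation would run the same loop leftward from the seed as well and combine the two results into a single HSP.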

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.

Another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions. “Bind(s) substantially” refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target polynucleotide sequence. The phrase “hybridizing specifically to” or “specifically hybridizing to”, refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.

The term “stringent conditions” refers to conditions under which a probe or primer will hybridize to its target subsequence, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. In other instances, stringent conditions are chosen to be about 20° C. or 25° C. below the melting temperature of the sequence and a probe with exact or nearly exact complementarity to the target. As used herein, the melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half-dissociated into single strands. Methods for calculating the T_m of nucleic acids are well known in the art (see, e.g., Berger and Kimmel (1987) Methods in Enzymology, vol. 152: Guide to Molecular Cloning Techniques, San Diego: Academic Press, Inc. and Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vols. 1-3, Cold Spring Harbor Laboratory), both incorporated herein by reference. As indicated by standard references, a simple estimate of the T_m value can be calculated by the equation: T_m = 81.5 + 0.41(% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (see, e.g., Anderson and Young, “Quantitative Filter Hybridization,” in Nucleic Acid Hybridization (1985)). Other references include more sophisticated computations which take structural as well as sequence characteristics into account for the calculation of T_m.

The melting temperature of a hybrid (and thus the conditions for stringent hybridization) is affected by various factors such as the length and nature (DNA, RNA, base composition) of the probe or primer and nature of the target (DNA, RNA, base composition, present in solution or immobilized, and the like), and the concentration of salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol). The effects of these factors are well known and are discussed in standard references in the art, see, e.g., Sambrook, supra, and Ausubel, supra. Typically, stringent conditions will be those in which the salt concentration is less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes or primers (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes or primers (e.g., greater than 50 nucleotides). Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide.
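The simple T_m estimate given above can be illustrated with a short calculation. The following is a minimal sketch, assuming the equation T_m = 81.5 + 0.41(% G+C) at 1 M NaCl exactly as stated; the function names and the default 5° C. offset parameter are hypothetical and chosen only for illustration. It ignores probe length, formamide and the other factors noted above.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def estimate_tm(seq: str) -> float:
    """Simple T_m estimate (degrees C) at 1 M NaCl: 81.5 + 0.41 * (%G+C)."""
    return 81.5 + 0.41 * gc_percent(seq)


def stringent_temperature(seq: str, offset: float = 5.0) -> float:
    """Hybridization temperature set `offset` degrees C below the estimated T_m.

    The default offset of 5 reflects the "about 5 degrees lower" guideline;
    other offsets (e.g., 20 or 25) can be passed explicitly.
    """
    return estimate_tm(seq) - offset
```

For example, a sequence with 50% G+C gives an estimate of about 102° C. under this equation, so the 5° C. guideline would suggest hybridizing at about 97° C. in the absence of destabilizing agents.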

The term “detectably labeled” means that an agent (e.g., a probe) has been conjugated with a label that can be detected by physical, chemical, electromagnetic and other related analytical techniques. Examples of detectable labels that can be utilized include, but are not limited to, radioisotopes, fluorophores, chromophores, mass labels, electron dense particles, magnetic particles, spin labels, molecules that emit chemiluminescence, electrochemically active molecules, enzymes, cofactors, and enzyme substrates.

II. Overview

The present invention provides methods for efficiently identifying and characterizing genes that play important roles in cellular processes such as aging and development, response to environmental challenges (e.g., injury or drug exposure), and pathologic processes. Specifically, the methods disclosed herein permit the rapid and economical generation of “libraries” of differentially expressed and low abundance sequences likely to play roles in pathogenesis and treatment of human disease. Importantly, the methods of the invention are well suited to use with very small amounts of tissue. This permits comprehensive libraries to be produced even when only a small amount of starting material is available.

The methods also include a process in which genes identified as being present in low abundance and/or as being differentially expressed (“candidate genes”) are functionally validated. This validation process involves determining whether a candidate gene does in fact play a functional role in a cell by, for example, determining if modulation of expression of the candidate gene is correlated with an alteration in a cellular activity or cellular state in the cell in which expression is modulated.

Certain methods are performed using double-stranded RNA interference (RNAi). In general, such methods involve introducing a dsRNA that specifically hybridizes to at least a segment of the candidate gene into a reference cell or tissue and then determining whether interference with expression is associated with alteration of cellular activity or state in the cell or tissue into which the dsRNA is introduced. Detection of such an alteration provides evidence that the candidate gene is correlated with the particular cellular state or process under investigation.

However, methods other than RNAi can be utilized to functionally validate candidate genes identified in the libraries. Such methods include interference with gene expression by use of antisense technology, ribozymes and gene knock-out approaches. Additional approaches include co-immunoprecipitation and epistasis investigations.

III. Preparation of Libraries

Generally

In one aspect of the invention, cDNA libraries are prepared that are highly enriched for gene sequences likely to play a role in the molecular and cellular mechanisms of disease, or which are involved in other important cellular processes. In one embodiment of the invention, four related, or “cognate,” libraries are prepared and selected sequences analyzed. Although, in some embodiments of the invention, fewer than four libraries are prepared, by screening multiple (e.g., four) libraries the coverage of the transcriptome is maximized and the likelihood of identifying low-abundance and differentially-expressed genes is increased. Moreover, preparing four libraries facilitates the validation techniques described infra.

Tissue Sources

The libraries of the invention are prepared using mRNA from pairs of tissues that are of the same type, but which differ in one major characteristic, such as disease state (e.g., diseased and normal brain tissue), age (e.g., adult and fetal liver tissue), exposure to drugs, state of differentiation, stage of development, or other state (e.g., stimulated and unstimulated; activated and unactivated). The tissue source may be human or non-human. Typically the tissues are from a mammal such as a human, non-human primate, rat, or mouse. In some embodiments, the tissues are from an animal or tissue culture model of a human disease, e.g., stroke, Alzheimer's disease, and neuropathy. Examples of tissue pairs useful for library preparation are shown in Table 1.

TABLE 1

Gene-expression state 1          Gene-expression state 2

Diseased tissue                  Normal tissue
a) hypoxic/ischemic brain        a) healthy brain
b) cirrhotic liver               b) healthy liver
c) tumor                         c) normal tissue
d) Alzheimer's brain             d) healthy brain

Drug-exposed tissue              Non-drug exposed tissue
a) kainate-injected brain        a) saline injected brain
b) Zyprexa®-injected brain       b) saline injected brain
c) toxin-stimulated cell line    c) saline stimulated cell line

Age/Tissue Type/etc.             Age/Tissue Type/etc.
a) mature brain                  a) fetal brain
b) hippocampus                   b) cortex
c) neurons                       c) glial cells

Although each of any group of four cognate libraries is prepared using the same tissue pair, the libraries have different properties as a result of differences in their construction. For each set of libraries, one tissue in the pair is designated the “driver tissue,” “control tissue,” or simply “control cell” (from which “driver” cDNA may be made) and the second tissue in the pair is designated the “tester” tissue, “test tissue,” or simply “test cell” (from which “tester” cDNA may be made). For example, for a pair in the same horizontal row of Table 1, the tissue in the first column may be considered the tester and the tissue in the second column may be considered the driver. For purposes of the invention, it is entirely arbitrary which tissue is “driver” and which is “tester.”

For ease of reference, the four cognate libraries are referred to herein as: (1) driver-normalized, (2) tester-normalized, (3) driver-subtracted, and (4) tester-subtracted. Libraries (1) and (2) are normalized, and thus enriched in sequences corresponding to low abundance transcripts. In a cognate group, Library 1 is made using one tissue of a pair (driver tissue) and Library 2 is made using the specified tester tissue. Libraries (3) and (4) are subtracted (or normalized and subtracted) libraries and thus enriched in sequences that are differentially expressed between pairs of tissue states. Libraries (3) and (4) of a cognate group are made using both tissues in the tissue pair.

Several methods are known for making normalized and/or subtracted cDNA libraries. Although certain methods are described or referred to in Sections II(B)-(E), infra, the invention is not limited to embodiments in which these methods are used. For example, the analytical methods described in Section III may be used in combination with a variety of normalization/subtraction approaches.

Preparation of Double-Stranded cDNA From Paired Tissue Samples

Double-stranded cDNA (dscDNA) is prepared from tissues using standard protocols, i.e., by reverse transcription of messenger (poly A⁺) RNA from a specified RNA source using a primer to produce single-stranded cDNA. Methods for isolation of total or poly(A) RNA and for making cDNA libraries are well known in the art, and are described in detail in Ausubel and Sambrook (supra). In one embodiment, the library is made using oligo(dT) primers for first strand synthesis. The single-stranded cDNA is converted into double-stranded cDNA (dscDNA) using routine methods (see, e.g., Ausubel supra).

Restriction Enzyme Digestion

In some embodiments of the invention, the dscDNA from each tissue source is digested with one restriction enzyme or, in an alternative embodiment, the dscDNA from each tissue source is separately digested with two or more restriction enzymes, with different specificities, that cut at recognition sequences found frequently in the dscDNA. Often, two enzymes are used (and the discussion and examples below will refer to use of two enzymes). As noted, the digestion with each of the two or more enzymes is carried out separately (e.g., in separate reaction tubes). The digested fragments may be combined later for further processing.

The dual digestion steps allow for the efficient generation of libraries that are more comprehensive (e.g., containing more different expressed or differentially expressed species) than libraries made by other methods. The digestion is intended, in part, to generate fragments in a size range that allows efficient hybridization during the annealing steps of library construction. Only fragments of the target size range will efficiently anneal under the conditions used, and non-annealing molecules are excluded from amplification or cloning in some embodiments of the invention. A further advantage of the dual digestion steps is that digestion with multiple (e.g., 2) enzymes with different specificities, as taught herein, makes the resulting libraries more comprehensive.

According to the invention, the restriction enzymes are selected to produce a calculated (or “predicted”) average fragment size of between about 100 and about 500 basepairs, preferably about 300-500 basepairs (e.g., an average length of between 300 bases and 500 bases). In addition, the two or more different enzymes should produce fragments of similar lengths (e.g., so that each has a calculated average fragment size within about 150 bases, more often about 100 bases, of the calculated average fragment size of the other). Because PCR is generally more efficient for shorter fragments, the use of fragments of similar length also ensures non-biased PCR amplification between fragments resulting from digestion with different enzymes at subsequent steps in library construction.

The calculated average fragment size produced by digestion of a particular sample with a particular enzyme can be determined in a variety of ways. In one embodiment, a database of mRNA/cDNA sequences corresponding to a selected class of mRNAs is used as a representative proxy for the entire population of mRNAs of that class. One database suitable for this analysis is GenBank (accessible at, e.g., http://www.ncbi.nlm.nih.gov/). Using this method, a set of mRNA sequences known to be expressed in a specified tissue (e.g., brain), organism (e.g., rat, human), or taxonomic class (e.g., Mammalia) is identified. Such identification can be easily accomplished because sequences in databases such as GenBank are annotated, so that an investigator can select sequences with particular properties. The frequency and distribution of particular restriction enzyme recognition sites in the selected population of sequences is then determined, e.g., by inspection, but most conveniently by using a computer program such as GCG (Genetics Computer Group Inc., Madison, Wis.) or Sequencher (Gene Codes Corp, Ann Arbor, Mich.). In addition, the distribution of restriction sites in the population can be determined using publicly available computer software, and enzymes that frequently cut at clustered sites identified; such enzymes are less desirable than those that recognize more evenly distributed sites.
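The site-frequency analysis described above can be sketched in a few lines of code. The following is a minimal illustration, not the GCG or Sequencher procedure: it counts occurrences of a recognition site in a set of sequences and derives a per-sequence site frequency, the number of sequences not cleaved, and an average fragment length. The function names are hypothetical, and the way the average is computed here (total bases divided by total fragments, uncut sequences included) is an assumption; the source does not state which convention was used.

```python
import re

# IUPAC codes needed for the recognition sites discussed here (e.g., RGCY).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def site_pattern(site):
    """Compile a recognition site into a regex; the lookahead counts overlapping sites."""
    body = "".join(IUPAC[ch] for ch in site.upper())
    return re.compile(f"(?=({body}))")


def digest_stats(sequences, site):
    """Return (sites per sequence, number of uncut sequences, average fragment length)."""
    pattern = site_pattern(site)
    total_sites = 0
    uncut = 0
    fragments = 0
    total_length = 0
    for seq in sequences:
        n_sites = len(pattern.findall(seq.upper()))
        total_sites += n_sites
        if n_sites == 0:
            uncut += 1
        fragments += n_sites + 1  # n cuts in a linear molecule give n + 1 fragments
        total_length += len(seq)
    return total_sites / len(sequences), uncut, total_length / fragments
```

Running `digest_stats` over a collection of annotated mRNA sequences for each candidate enzyme yields the kind of per-enzyme summary that can then be compared across enzymes. It does not assess whether sites are clustered, which would need the site positions as well as their counts.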

Table II summarizes an experiment in which enzymes suitable for use with dscDNA prepared from rat mRNA were identified. To identify these enzymes, a set of 489 full-length rat mRNA/cDNA sequences was collected from GenBank. The selected sequences from rat included a poly A signal at the 3′ end as well as an entire protein coding sequence (ORF) and at least 100 base pairs of 5′ UTR. The mRNA sequences analyzed had an average length of 2257 bases (with an average coding sequence length of 1509 bases and an average 3′ untranslated region of 604 bases). The restriction pattern predicted for digestion of this polynucleotide set was determined using the GCG program described supra.

Exemplary enzymes for digestion of mammalian sequences include Alu I, Cvi RI, Dpn I, Hae III, Rsa I, Cvi JI and Tha I. As is apparent from the table, most suitable enzymes recognize 4-base restriction sites and are blunt-cutters. As determined in the experiment summarized in Table II, a preferred combination of enzymes for construction of libraries from mammalian sequences is Dpn I and Rsa I, because they produce fragments of similar size in the desired size range.

TABLE II

Enzyme    Recognition site    Rec. sites/mRNA    Not cleaved    Average size
Alu I     AGCT                13.07              0              175
Cvi JI    RGCY                51.89              0              47
Cvi RI    TGCA                11.36              3              199
Dpn I     GATC                 7.17              13             319
Hae III   GGCC                13.23              0              216
Rsa I     GTAC                 5.21              24             424
Tha I     CGCG                 2.70              171            1044
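The size criteria stated above (average fragment size within a target range, and the two enzymes within about 150 bases of each other) can be applied to the Table II values as a simple filter. This is a sketch under those stated criteria only; the function name and parameters are hypothetical, Tha I is omitted because its average size is far above the target range, and other considerations mentioned in the text (number of sequences not cleaved, clustering of sites) are not modeled.

```python
from itertools import combinations

# Calculated average fragment sizes (bp) as listed in Table II.
AVERAGE_SIZE = {
    "Alu I": 175,
    "Cvi JI": 47,
    "Cvi RI": 199,
    "Dpn I": 319,
    "Hae III": 216,
    "Rsa I": 424,
}


def candidate_pairs(sizes, low=100, high=500, max_diff=150):
    """Return enzyme pairs whose average sizes lie in [low, high] and differ by <= max_diff."""
    in_range = {enzyme: size for enzyme, size in sizes.items() if low <= size <= high}
    return [
        (a, b)
        for a, b in combinations(sorted(in_range), 2)
        if abs(in_range[a] - in_range[b]) <= max_diff
    ]
```

With the preferred 300-500 basepair range (`low=300`), the only pair that passes is Dpn I with Rsa I, consistent with the combination identified in the text. With the broader 100-500 range, several other pairs also pass, so the broader range alone does not single out one combination.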

In alternative embodiments, the average fragment size can be determined empirically. For example, average fragment size can be determined by PCR amplification of a large number (e.g., at least 500) of clones from a normalized or subtracted library with vector-specific primers, followed by size determination of inserts on agarose gels.

As noted above, each restriction digestion is carried out separately (i.e., in a separate reaction vessel). Table III provides a flowchart illustrating the production of restriction digested dscDNA from a tissue pair using restriction enzymes Dpn I and Rsa I. Parenthetical numbers are used to refer to specific products (i.e., reagents) produced or used for library production.

TABLE III

(normal) tissue    →  a) Dpn I digest (1)
                      b) Rsa I digest (2)

(diseased) tissue  →  a) Dpn I digest (3)
                      b) Rsa I digest (4)

In embodiments in which digestion is carried out with a single enzyme, any enzyme that would have been suitable as part of an enzyme pair may be used (e.g., Dpn I or Rsa I).

Addition of Adaptors

According to the invention, the digested fragments (e.g., digests 1-4 in Table III) are divided into two aliquots and each aliquot is ligated to an adaptor oligonucleotide, i.e., the first aliquot is ligated to a first adaptor and the second aliquot is ligated to a second adaptor. The adaptors used are usually designed to create a 22 to 40 base upper strand hybridized to an 8-12 base lower strand (i.e., partially double-stranded). Adaptors are ligated to dscDNA fragments using methods well known in the art. For example, unphosphorylated oligonucleotides may be ligated to dscDNA fragments in a standard ligation reaction (e.g., a buffered mixture containing adaptors, fragments, 0.3 mM ATP and T4 DNA ligase, incubated for 12 h at 14° C.).

The adaptors are designed according to the following criteria: 1) the ligation of the adaptor to the fragment should reconstitute the restriction enzyme recognition sequence for the restriction enzyme used to produce the fragments; 2) the adaptor should have a sequence sufficiently long and complex to serve as a target for amplification by the polymerase chain reaction (PCR), e.g., nested PCR; and 3) the first and second adaptors should have different sequences so that a molecule containing both adaptor sequences at opposite ends of a fragment can be differentiated from a molecule containing the same adaptor sequence at each end by PCR amplification using suitable primers.

Methods for preparation of normalized and subtracted libraries by using adaptors suited to PCR amplification are known in the art and may be referred to in the practice of the present invention. See, e.g., Straus and Ausubel, 1990, Proc. Natl. Acad. Sci. 87:1889; and Diatchenko et al., 1996, Proc. Natl. Acad. Sci. 93:6025-30; see also U.S. Pat. No. 5,759,822, all of which are incorporated herein by reference.

Exemplary adaptors are shown in Table IV, along with primer sets that may be used for PCR amplification:

TABLE IV

No. 1*
  First adaptor:
    (SEQ ID NO:1) 5′-CTAATACGACTCACTATAGGGCTCGAGCGGCCGCCCGGGCAGGT-3′
    (SEQ ID NO:4) 5′-ACCTGCCCGG-3′
  Second adaptor:
    (SEQ ID NO:2) 5′-CTAATACGACTCACTATAGGGCAGCGTGGTCGCGGCCGAGGT-3′
    (SEQ ID NO:5) 5′-ACCTCGGCCG-3′
  Corresponding primers:
    (SEQ ID NO:3) 5′-CTAATACGACTCACTATAGGGC-3′;
    Nested PCR Primer 1: (SEQ ID NO:6) 5′-TCGAGCGGCCGCCCGGGCAGGT-3′;
    Nested PCR Primer 2: (SEQ ID NO:7) 5′-AGCGTGGTCGCGGCCGAGGT-3′

No. 2*
  First adaptor:
    (SEQ ID NO:8) 5′-TCGAGCGGCCGCCCGGGCAGGT-3′
    (SEQ ID NO:9) 5′-ACCTGCCCGG-3′
  Second adaptor:
    (SEQ ID NO:10) 5′-AGCGTGGTCGCGGCCGAGGT-3′
    (SEQ ID NO:11) 5′-ACCTCGGCCG-3′
  Corresponding primers:
    (SEQ ID NO:12) 5′-TCGAGCGGCCGCCCGGGCAGGT-3′
    (SEQ ID NO:13) 5′-AGCGTGGTCGCGGCCGAGGT-3′

*partially double-stranded.
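Two properties of the exemplary adaptors can be checked directly from the listed sequences: that the short strand of each adaptor is complementary to the 3′ end of the long strand (the partially double-stranded structure), and that ligation to a blunt fragment end reconstitutes the recognition site. The sketch below is illustrative only. The helper names are hypothetical, and it assumes that Rsa I cuts GT^AC and Dpn I cuts GA^TC, both leaving blunt ends; the example fragment ends are invented for the demonstration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Long and short strands of the first exemplary adaptor pair (SEQ ID NOs 1, 4, 2, 5).
ADAPTOR_1_LONG = "CTAATACGACTCACTATAGGGCTCGAGCGGCCGCCCGGGCAGGT"
ADAPTOR_1_SHORT = "ACCTGCCCGG"
ADAPTOR_2_LONG = "CTAATACGACTCACTATAGGGCAGCGTGGTCGCGGCCGAGGT"
ADAPTOR_2_SHORT = "ACCTCGGCCG"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals_at_3prime_end(long_strand, short_strand):
    """True if the short strand is complementary to the 3' end of the long strand."""
    return long_strand.endswith(reverse_complement(short_strand))


def reconstitutes_site(long_strand, fragment_5prime_end, site):
    """True if the adaptor 3' end joined to the fragment 5' end spans the recognition site."""
    k = len(site)
    # Each side contributes at most k - 1 bases, so any match must cross the junction.
    junction = long_strand[-(k - 1):] + fragment_5prime_end[: k - 1]
    return site in junction


# A blunt Rsa I fragment begins with AC; a blunt Dpn I fragment begins with TC.
rsa_ok = reconstitutes_site(ADAPTOR_1_LONG, "ACTTG", "GTAC")
dpn_ok = reconstitutes_site(ADAPTOR_1_LONG, "TCAAG", "GATC")
```

Under these assumptions the listed adaptors, which end in GT, regenerate a GTAC site on Rsa I fragments but not a GATC site on Dpn I fragments, which illustrates why the adaptor end ligated to the fragment has to be matched to the enzyme used.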

Table V provides, in schematic terms, a flowchart illustrating the addition of adaptors to the products of Table III. In the illustration, the first adaptor is designated “Adaptor A” or “Adaptor C,” and the second adaptor is designated “Adaptor B” or “Adaptor D,” with different first and second adaptors being used for fragments produced using different restriction enzymes. Although pairs such as A and C or B and D will have different sequences at the end ligated to the fragment (so that the appropriate restriction site is regenerated upon ligation), to the extent possible the adaptors are designed to share the same sequence, e.g., to facilitate subsequent PCR amplification.

TABLE V

(normal) tissue    →  a) Dpn I digest (1)  →  i) adaptor A (1A)
                                              ii) adaptor B (1B)
                      b) Rsa I digest (2)  →  iii) adaptor C (2C)
                                              iv) adaptor D (2D)

(diseased) tissue  →  a) Dpn I digest (3)  →  i) adaptor A (3A)
                                              ii) adaptor B (3B)
                      b) Rsa I digest (4)  →  iii) adaptor C (4C)
                                              iv) adaptor D (4D)

The adaptor-ligated fragments corresponding to each of the separate digestion reactions can be, and typically are, combined before proceeding to the subsequent subtraction and normalization protocols. For example, referring to Table V, supra, 1A+2C, 1B+2D, 3A+4C, 3B+4D may be combined if adaptors A and C and adaptors B and D differ only at the 3′ end (in order to reconstitute the restriction site). However, if desired, the reactions may be combined at later stages, or, alternatively, they may be kept separate.

Production of Subtracted Libraries

Subtracted libraries (i.e., normalized-subtracted libraries) are used to efficiently identify genes that are differentially expressed in a pair of tissues. Two subtracted libraries are produced, a “driver-subtracted” library and a “tester-subtracted” library. When the “tester tissue” is stimulated tissue and the “driver tissue” is unstimulated, the “driver-subtracted” library will be enriched for genes down-regulated by stimulation and the “tester-subtracted” library will be enriched for genes up-regulated by stimulation.

Methods for normalization, subtraction and simultaneous normalization and subtraction are known (see, e.g., Ausubel §§5.8-5.9 and discussion infra). In one embodiment, the normalized-subtracted libraries of the invention are made essentially according to Diatchenko et al. supra. In another embodiment, the production of the normalized-subtracted libraries includes the following steps:

First Annealing Step

The following mixtures of adaptor-free digested fragments and adaptor-linked fragments are prepared and annealing reactions carried out (Table VI). The adaptor-free fragments are added in excess over the adaptor-linked fragments, e.g., at an about 20:1, 10:1, or 5:1 ratio. Multiple ratios can be used.

TABLE VI

driver-subtracted           tester-subtracted
Rxn 1) anneal 1A + 3        Rxn 5) anneal 3A + 1
Rxn 2) anneal 1B + 3        Rxn 6) anneal 3B + 1
Rxn 3) anneal 2C + 4        Rxn 7) anneal 4C + 2
Rxn 4) anneal 2D + 4        Rxn 8) anneal 4D + 2

The mixture is heat-denatured and allowed to anneal, e.g., by heat-denaturation for 90 seconds at 99° C. followed by incubation at 68° C. to allow annealing in 1 M NaCl, 50 mM HEPES (pH 8.3) and 4 mM cetyltrimethylammonium bromide. Annealing is allowed to proceed to multiple different Cot values by incubating samples or aliquots for varying times (e.g., 4-12 h for a first sample and 10-24 h for a second sample). Hybridization to multiple Cot values results in a more completely normalized library and/or increases the likelihood of enrichment of all differentially regulated genes. It will be recognized that in the annealing step, abundant sequences represented in the adaptor-ligated population will become double-stranded most rapidly, so that, as to adaptor-ligated single-stranded molecules, the library becomes enriched for low-copy number molecules present in the adaptor-ligated population. When annealing to multiple Cots is carried out, the products can be combined prior to the second annealing step, infra, or, alternatively, can be maintained separately throughout the amplification and optional cloning steps.

Second Annealing Step

The reaction mixtures of Table VI, supra, are combined and allowed to undergo a second hybridization step with excess (e.g., an about 20:1, 10:1, or 5:1 excess) freshly denatured driver (i.e., adaptor-free fragments), as shown in Table VII.

TABLE VII

driver-subtracted
Rxn 9)  products of Rxns 1 + 2 + additional denatured fragment 3*
Rxn 10) products of Rxns 3 + 4 + additional denatured fragment 4

tester-subtracted
Rxn 11) products of Rxns 5 + 6 + additional denatured fragment 1
Rxn 12) products of Rxns 7 + 8 + additional denatured fragment 2

*(see Tables III and VI)

Annealing is allowed to proceed to different Cot values by incubating samples or aliquots for various times (e.g., 4-20 h).

Amplification

After hybridization, PCR amplification is performed to isolate sequences of interest. In general, only molecules carrying adaptors at both ends can be amplified exponentially by PCR. Species carrying an adaptor at only one end are amplified with linear kinetics, whereas non-adaptor-ligated molecules are not amplified at all. Thus, the double adaptor-ligated population enriched in low-abundance or differentially expressed genes is isolated by PCR amplification. Typically, PCR amplification is done in a 2-step protocol using nested primers for the second amplification.
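The difference between exponential and linear amplification can be made concrete with an idealized calculation. The sketch below assumes perfect efficiency in every cycle, which real reactions do not achieve, and treats any molecule with adaptors at both ends as exponentially amplifiable; the function name and its parameters are hypothetical.

```python
def copies_after_pcr(initial_copies, cycles, adaptor_ends):
    """Idealized copy number after PCR for molecules with 0, 1 or 2 adaptor-bearing ends."""
    if adaptor_ends == 2:
        # Both strands can be primed each cycle: doubling per cycle.
        return initial_copies * 2 ** cycles
    if adaptor_ends == 1:
        # Only the original template is copied: one new strand per cycle.
        return initial_copies * (1 + cycles)
    if adaptor_ends == 0:
        # No priming site: no amplification.
        return initial_copies
    raise ValueError("adaptor_ends must be 0, 1 or 2")
```

Under these assumptions, 25 cycles turn one double adaptor-ligated molecule into about 3.4 × 10⁷ copies, while a single-adaptor molecule yields only 26, which is why the amplified product is dominated by the double adaptor-ligated population.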

Production of Normalized Libraries

Normalization is the process by which redundant clones in a library are removed, without reducing the complexity of the library. After successful normalization, approximately equal numbers of all expressed genes are present in a library.

Typically, normalization methods are based on the reassociation kinetics of nucleic acids, in which denatured DNA is hybridized to an excess amount of denatured complementary DNA. Because re-annealing nucleic acids follow approximately second-order kinetics, the most abundant species form double-stranded hybrids most quickly. Thus, at any given Cot, rare or less abundant species will preferentially remain single stranded and abundant species will enter the population of double-stranded molecules. Several methods are available for distinguishing, separating, or differentially amplifying the single stranded species. Exemplary normalization methods are found in Soares et al., 1994, Proc. Natl. Acad. Sci. 91:9228-32; Bonaldo et al., 1996, Genome Res. 6:791-806; and U.S. Pat. Nos. 5,637,685; 5,846,721; 5,482,845; 5,830,662; 5,702,898; and Ausubel, supra.
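The effect of second-order kinetics on abundance can be illustrated with the ideal reassociation equation C/C0 = 1/(1 + k·C0·t), where C0 is the starting concentration of a single-stranded species and C is the concentration still single stranded at time t. The sketch below is a simplified model in which each species re-anneals only with its own complement; the rate constant `k`, the arbitrary units and the function names are assumptions made for illustration.

```python
def fraction_single_stranded(c0, t, k=1.0):
    """Ideal second-order reassociation: fraction of a species still single stranded."""
    return 1.0 / (1.0 + k * c0 * t)


def remaining_single_stranded(abundances, t, k=1.0):
    """Return the single-stranded amount left for each species after annealing time t."""
    return {
        name: c0 * fraction_single_stranded(c0, t, k)
        for name, c0 in abundances.items()
    }


# Two species that start at a 1000:1 abundance ratio (arbitrary units).
start = {"abundant": 1000.0, "rare": 1.0}
after = remaining_single_stranded(start, t=1.0)
ratio_after = after["abundant"] / after["rare"]
```

In this model the 1000:1 starting ratio falls to roughly 2:1 in the single-stranded fraction, because the abundant species is almost entirely driven into duplex while half of the rare species remains single stranded. This is the sense in which the single-stranded population becomes normalized.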

In one embodiment, two normalized libraries (referred to as “tester-normalized” and “driver-normalized”) are produced. In one embodiment, each normalized library is produced essentially according to the protocol described in §F, supra, except that the driver and tester are identical. Thus, in one embodiment, the following reactions in Table VIII are carried out.

TABLE VIII

driver-normalized           tester-normalized
Rxn 1) anneal 1A + 1        Rxn 5) anneal 3A + 3
Rxn 2) anneal 1B + 1        Rxn 6) anneal 3B + 3
Rxn 3) anneal 2C + 2        Rxn 7) anneal 4C + 4
Rxn 4) anneal 2D + 2        Rxn 8) anneal 4D + 4

It will be appreciated that, if desired, reactions 1 and 2, 3 and 4, 5 and 6, and 7 and 8 can be combined.

IV. Optimized Selection of Species for Further Analysis

For each library produced, further analysis is carried out to identify sequences likely to be of particular interest. These include genes in the low abundance classes from normalized libraries and differentially expressed genes.

Screening both normalized and normalized-subtracted libraries allows comprehensive analysis of the actual expression status of the material under investigation. Previous methods for gene expression analysis operating on a large set of genes (cDNA arrays, oligonucleotide arrays) require a priori knowledge of the genes under investigation and are considered to be “closed” systems. In contrast, the method disclosed herein not only combines high-throughput methods for identification of rare or differentially expressed genes but also permits analysis with no prior knowledge about the gene expression changes expected. That is, the genes under investigation are generated by the method itself and are usually significantly more relevant for the biological process than a preselected set of genes.

Generally

In one embodiment, the preferentially amplified or cloned products of subtraction, normalization or combination subtraction-normalization methods are obtained, as described above or by other methods of normalization and/or subtraction. The resulting cDNA (libraries) are subcloned by ligation into a vector capable of propagation in a bacterial or eukaryotic cell. Typically, the clones are propagated in bacterial cells. A number of suitable vectors and cloning methods are known (see, e.g., Sambrook, and Ausubel, both supra), including “TA” cloning of PCR products (Stratagene, La Jolla, Calif.) or blunt-end ligation into a vector of fragments following a fill-in reaction using T4 DNA polymerase and dNTPs.

Further analysis is then carried out by propagating a large number of clones (i.e., by growing a large number of colonies or plaques containing clones from the library(s)). Typically, at least about 5000 clones, more often 10,000, sometimes 15,000 and frequently 25,000 clones are propagated. Because of the large number of clones that are analyzed, it is most convenient and practical to grow clones in multiwell plates (e.g., 384-well plates), using robotic means for growing and picking colonies. Suitable means are known in the art and are described at, e.g., Nguyen et al., 1995, Genomics 29:207-216. Alternatively, large numbers of clones can be grown and picked manually.

The insert (i.e., cloned sequences) from each of the clones is isolated and positioned on an array for further analysis. That is, the insert DNAs are immobilized at identified positions in a matrix suitable for hybridization analysis. In one embodiment, high-density filter arrays (HDFA) containing up to 12,000 PCR products per 8×12 cm membrane are used (Nguyen et al., supra). Alternatively, sequences may be “printed” onto glass plates, as is described generally by Schena et al., 1995, Science 270:467-470.

Most conveniently, the insert corresponding to each clone is amplified by PCR using vector specific primers for spotting on the array. However, other approaches can be used. For example, DNA from each clone can be isolated, the DNA can be digested with a restriction enzyme(s) that cuts at the boundary of the vector and insert, and the insert sequence can be isolated and spotted on the array.

The arrayed sequences are then probed with labeled cDNA derived from “driver” (e.g., unstimulated) tissue or “tester” (e.g., stimulated) tissue. Labeled probes can be prepared using methods known in the art, e.g., by reverse transcription of isolated RNA from the driver and tester tissues in the presence of radiolabeled or fluorescently-labeled nucleotides (see, e.g., Ausubel, supra; Kricka, 1992, Nonisotopic DNA Probe Techniques, Academic Press San Diego, Calif.; Zhao et al., 1995, Gene 156:207; Pietu et al., 1996 Genome Res. 6:492). Alternative methods for preparing probes, e.g., riboprobes, are well known and their use is contemplated in some embodiments of the invention.

Optimal hybridization conditions for probing will depend on the type of array (e.g., filter, slide, etc.) selected, the method of labeling probe, and other factors. Hybridization is carried out under conditions of excess immobilized (arrayed) nucleic acid. General parameters for specific (i.e., stringent) hybridization conditions for nucleic acids are described in Sambrook and Ausubel. Suitable hybridization conditions for probing high density arrays are provided in Schena et al., 1996, Proc. Natl. Acad. Sci. USA, 93:10614, and Nguyen, supra.

When fluorescently labeled probes are used, the fluorescence emissions at each site of a transcript array are detected (e.g., by scanning confocal laser microscopy or laser illumination, see, e.g., Shalon et al., 1996, Genome Research 6:639-645; Schena et al., 1996, Genome Res. 6:639-645; Ferguson et al., 1996, Nature Biotech. 14:1681-1684). When radiolabeled probes are used, autoradiography or quantitative imaging systems (e.g., FUJIX BAS 1000 (Fuji)) may be used. See Nguyen et al., supra, and references cited therein. When it is desirable to determine the ratio of hybridization of two or more probes to the same set of clones, multiple copies of a specific array can be prepared and separately probed, the hybridization intensity determined for each clone, and a ratio determined. Alternatively, a single array can be repeatedly probed, with washing steps between hybridizations. When differently labeled (e.g., fluorescently-labeled) probes are used, multiple (e.g., 2) differently labeled probes may be simultaneously hybridized to the same matrix (e.g., rhodamine-labeled driver cDNA and fluorescein-labeled tester cDNA), and, for any particular hybridization site on the transcript array, a ratio of the emission of the two fluorophores can be calculated from simultaneous hybridization to the same array.

One goal of the hybridization is to identify clones corresponding to mRNAs expressed at low abundance in driver and tester tissues, particularly clones corresponding to differentially expressed sequences. In the case of normalized libraries, both driver-normalized and tester-normalized libraries are probed with labeled cDNA from the tissue from which they are derived, as indicated in Table IX. Because the signal intensity for any arrayed clone will correspond to the abundance of the corresponding mRNA in the tissue, clones with low intensity signals (i.e., “low signal clones”) will correspond to low abundance transcripts (i.e., mRNAs rare in the transcriptome). A “low intensity signal” or “low signal clone” refers to a clone having a hybridization signal in the lowest (e.g., 1st to 20th percentile) or very lowest (e.g., 1st to 5th percentile) range in a ranking of a large number (e.g., 1000) of clone signals in the array. This mRNA class is believed to be enriched for sequences of pharmaceutical importance.

TABLE IX

Array: driver-normalized library array
Probe: labeled cDNA probe from driver tissue (e.g., unstimulated tissue)
Selection: select low signal clones

Array: tester-normalized library array
Probe: labeled cDNA probe from tester tissue (e.g., stimulated tissue)
Selection: select low signal clones
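The percentile-based definition of a “low signal clone” lends itself to a simple ranking procedure. The following is a minimal sketch, assuming that each clone has a single background-corrected hybridization signal; the function name, the clone identifiers and the rounding-down rule for the cutoff are hypothetical.

```python
def low_signal_clones(signals, percentile=20.0):
    """Return clone IDs whose signal ranks within the lowest `percentile` percent.

    `signals` maps clone ID -> hybridization signal intensity.
    """
    ranked = sorted(signals, key=signals.get)        # lowest signal first
    cutoff = int(len(ranked) * percentile / 100.0)   # number of clones to keep
    return ranked[:cutoff]


# Ten clones with signals 1.0 through 10.0 (arbitrary units).
example_signals = {f"clone{i}": float(i) for i in range(1, 11)}
lowest_20 = low_signal_clones(example_signals, percentile=20.0)
```

With only ten clones the 5th percentile selects nothing, which illustrates why the ranking is meant to be applied to a large number of clone signals (e.g., 1000).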

There are several advantages to screening both the tester- and driver-normalized libraries. Disease, drug exposure, and other stimulation lead to changes in the overall composition of the transcriptome as well as to transitions of genes from one abundance class into another. Thus, the identity of the expressed genes as well as their expression levels will be different for the two tissues. These differences will be reflected in the composition of the two libraries, first, because normalization is never complete (i.e., the resulting library is never perfectly normalized) and, second, because low abundance genes from one library are sometimes not found in the other.

In the case of the subtracted libraries (i.e., the driver-subtracted and tester-subtracted libraries), both are probed using labeled probes (e.g., cDNA probes) from both RNA sources (i.e., cDNA from driver tissues and cDNA from tester tissues). The ratio between the signals obtained by tester and driver probes indicates the up-regulation or down-regulation of a given clone in response to a stimulus. Thus, probing both driver-subtracted and tester-subtracted libraries will identify all genes that change in expression, either by up-regulation (tester-subtracted) or down-regulation (driver-subtracted). Typically, genes showing at least a 20% (1.2-fold) change are of interest, with genes showing a 2-fold difference in expression considered to be of particular interest. Preferably, the genes show at least about a 3-fold, 5-fold or 10-fold difference in expression. Clones exhibiting these differences in expression, as detected by hybridization of different probes, are referred to as “high ratio” clones.

TABLE X

Array: driver-subtracted (e.g., enriched for sequences down-regulated in stimulated tissue)
Probe: A. labeled cDNA probe from driver tissue; B. labeled cDNA probe from tester tissue
Selection: Select a high ratio of A:B. Optionally select clones where either A or B give a low intensity signal.

Array: tester-subtracted (e.g., enriched for sequences up-regulated in stimulated tissue)
Probe: A. labeled cDNA probe from driver tissue; B. labeled cDNA probe from tester tissue
Selection: Select a high ratio of B:A. Optionally select clones where either A or B give a low intensity signal.
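The selection rule in Table X can be expressed as a ratio filter. The sketch below is one possible reading: for a driver-subtracted array it keeps clones with a high A:B (driver:tester) ratio, and for a tester-subtracted array it keeps clones with a high B:A ratio. The function name, the data layout and the handling of zero signals are assumptions; the fold thresholds come from the text.

```python
def high_ratio_clones(driver_signal, tester_signal, library, min_fold=2.0):
    """Return {clone: ratio} for clones meeting the fold threshold.

    driver_signal / tester_signal map clone ID -> signal from probe A / probe B.
    library is "driver-subtracted" (select A:B) or "tester-subtracted" (select B:A).
    """
    if library not in ("driver-subtracted", "tester-subtracted"):
        raise ValueError("unknown library type")
    selected = {}
    for clone, a in driver_signal.items():
        b = tester_signal[clone]
        if a <= 0 or b <= 0:
            continue  # a ratio is undefined without a positive signal from both probes
        ratio = a / b if library == "driver-subtracted" else b / a
        if ratio >= min_fold:
            selected[clone] = ratio
    return selected


# Invented signals for three clones, in arbitrary units.
probe_a = {"c1": 100.0, "c2": 50.0, "c3": 10.0}   # driver probe
probe_b = {"c1": 20.0, "c2": 55.0, "c3": 40.0}    # tester probe
down_regulated = high_ratio_clones(probe_a, probe_b, "driver-subtracted")
up_regulated = high_ratio_clones(probe_a, probe_b, "tester-subtracted")
```

Passing `min_fold=1.2` applies the 20% threshold, while 3.0, 5.0 or 10.0 apply the stricter preferred cutoffs. Clones skipped for lack of a positive signal from one probe may still be worth examining separately, since the table also allows selection of clones where either probe gives a low intensity signal.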

The hybridization analysis described provides an efficient way for prioritizing clones of likely high pharmaceutical significance for further analysis. Selected clones are usually characterized by DNA sequencing and homology analysis. Genes derived from such normalized libraries are used as a representative, relevant and non-redundant gene collection of a particular tissue and a particular biological question for a variety of downstream applications. These genes can serve as targets for array analysis allowing one to quantitate gene expression changes in the same or other biological models and complement the gene collection identified by normalized-subtracted libraries. The analysis of a number of normalized libraries from a variety of central and peripheral tissues under different conditions of stimulation provides an avenue for the ultimate identification of all genes expressed in the species under investigation. In addition, it will be appreciated that, in some embodiments, the arrayed sequences are screened with other probes; for example, an array of sequences differentially expressed in stroke vs. normal brain can be screened with cDNA probe made from mRNA of Alzheimer's Disease brain tissue.

“Knock-Down” Analysis

One advantage of the present method is that, among the genes selected for further analysis on the basis of hybridization, the level of redundancy is low (i.e., the number of genes that are repeatedly sequenced is low) and the percentage of novel genes detected (genes not previously reported in GenBank) is high.

In contrast, in some prior art DNA libraries, clones representing a small number of parent genes comprise a large proportion of all the clones in the library. These highly represented (or highly redundant) genes are particularly common in non-normalized libraries, or in libraries from less complex sources, such as specific sub-regions of tissue or cell lines. Random selection of genes from such a library for analysis (e.g., sequencing) results in significant redundancy of effort and expense.

The “knock-down” methods of the invention can be used to further reduce redundancy both in the libraries described herein supra, and in libraries prepared by altogether other means (including non-normalized libraries or libraries prepared from specific sub-regions of tissue or cell lines). The knock-down method is used to identify clones that are redundant in a library (i.e., clones generated from transcripts having the same sequence) so that the effort and expense of characterizing the redundant sequences is avoided.

According to the knock-down method, redundant sequences in the library are identified by “prior sampling.” That is, prior to the hybridization analysis described in Section III(A), supra, or the equivalent of such hybridization, the DNA sequence is determined for a representative number of clones, usually at least 50, often between about 100 and about 400 clones, and sometimes more, for example, about 1000 clones. These analyzed clones are referred to as the “prior sample.” It is not necessary to sequence the entire clone; rather only one, or optionally both, termini need be sequenced (e.g., typically at least about 50 bases are determined, more often between about 200 and 350 bases). The sequences are analyzed, for example by BLAST searching (Altschul et al., 1990, J. Mol. Biol. 215:403-10). A redundant sequence will appear more often than average. For example, a BLAST-identified sequence appearing as more than 4% of the sample is considered redundant.
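The 4% rule for the prior sample can be illustrated with a short Python sketch. It assumes that each sequenced clone has already been assigned a gene identifier (for example, its top BLAST hit); the function name `redundant_genes` and the input format are hypothetical and chosen only for illustration.

```python
from collections import Counter


def redundant_genes(prior_sample_hits, threshold=0.04):
    """Return {gene: fraction} for genes exceeding the redundancy threshold.

    prior_sample_hits: one gene identifier per sequenced clone (assumed to come
    from a prior BLAST search). threshold: fraction of the prior sample above
    which a gene is treated as redundant (4% in the example in the text).
    """
    total = len(prior_sample_hits)
    if total == 0:
        return {}
    counts = Counter(prior_sample_hits)
    return {gene: n / total for gene, n in counts.items() if n / total > threshold}
```

For a prior sample of 100 clones in which one gene accounts for 10 clones, that gene would be flagged, while a gene accounting for exactly 4 clones would not, since the criterion is "more than 4%".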

In one embodiment of the invention, a set of previously identified genes is included as “knock-down” (e.g., unlabeled) polynucleotides in the “knock-down” method, to identify and avoid further processing of clones that have already been characterized (e.g., sequenced).

If a particular clone or clones is found to be over-represented when compared to other members of the library, DNA may be isolated from the clone(s) (e.g., by PCR amplification of the fragment or insert) and included as an unlabeled (e.g., blocking) or distinctly labeled polynucleotide during a hybridization of a labeled probe mixture against an array of clones from the library, as described in Section III(A) supra. Typically the unlabeled or distinctly labeled “knock-down” polynucleotide is included at a concentration of about 5 to about 100 ng/ml in the hybridization mixture, often from about 5 to about 40 ng/ml. Other useful concentrations will be apparent to one of ordinary skill following the guidance of this disclosure. The unlabeled or distinctly labeled polynucleotides are referred to herein as “knock-down” polynucleotides. In one embodiment, a small number of redundant genes (e.g., one to ten) appearing in the “prior sample” may be included as “knock-down” polynucleotides. In another embodiment, many or all genes appearing in the “prior sample” can be included as “knock-down” polynucleotides.

The included unlabeled (or distinctly labeled) “knock-down” polynucleotide will hybridize to complementary sequences in the labeled probe mixture, reducing the amount of specific labeled probe species available for hybridization to the array. Comparison of the signal of the probe with and without the addition of knock-down polynucleotide will show that the inclusion of the knock-down clone(s) reduces hybridization signals at particular sites on the matrix. The sites of reduced signal correspond to sequences that are represented in the set of “knock-down” polynucleotides (i.e., redundant sequences by frequency or known sequences by prior sampling). Having identified such clones, a decision may be made not to further analyze (e.g., sequence) the clones, saving time and effort.
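The comparison of signals with and without knock-down polynucleotide can be sketched as follows. The text does not specify how large a reduction must be to mark a site as knocked down; the 50% cutoff used here (`max_fraction=0.5`), the function name and the dictionary-based input format are all assumptions for the sake of a runnable example.

```python
def knocked_down_sites(signal_without, signal_with, max_fraction=0.5):
    """List array sites whose signal drops when knock-down polynucleotide is added.

    signal_without / signal_with: {site_id: hybridization signal} measured
    without and with knock-down polynucleotide in the hybridization mixture.
    max_fraction: assumed cutoff; a site is flagged if its signal with
    knock-down is at most this fraction of its signal without knock-down.
    """
    flagged = []
    for site, baseline in signal_without.items():
        if baseline <= 0 or site not in signal_with:
            continue  # no usable baseline or no paired measurement
        if signal_with[site] / baseline <= max_fraction:
            flagged.append(site)
    return sorted(flagged)
```

Sites returned by such a comparison would correspond to clones that need not be sequenced further.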

Alternatively, when the “knock-down” polynucleotides are detectably labeled (using a label that can be distinguished from the probe label), redundant clones will be identifiable by the presence of the distinct signal at the matrix site. This requires an additional labeling step for the “knock-down” polynucleotides and, in one embodiment, requires an additional duplicate hybridization matrix or a measurement of the distinct signal. This is similar to the effort of measuring the signal of the primary (non-knock-down) labeled probe with and without the inclusion of “knock-down” polynucleotides.

Alternatively, redundant clones are identified by hybridization of single clones against an array representing the library, rather than by sequence analysis as discussed supra. A redundant clone will appear more than once, and more highly redundant clones will tend to appear more often than less redundant clones. Non-redundant clones will appear once. In this embodiment, duplications of the array allow testing of as many individual clones as desired to test their redundancy, and the decision may be made to not further analyze (e.g., sequence) the clones, saving time and effort.

V. Analysis of Methods of Library Construction

cDNA libraries are a critical reagent used by biologists in the analysis of gene expression and function. Various methods have been used to produce normalized and/or subtracted cDNA libraries (see, e.g., §II supra and Ausubel, supra). These methods are complex and entail numerous different parameters (e.g., annealing times, polynucleotide concentrations, primer choices, amplification conditions, and the like), all of which may affect library quality in sometimes unpredictable ways. However, the art lacks a convenient and economical method for evaluating the quality of normalized and/or subtracted cDNA libraries.

As used herein, the “quality” of a subtracted (or normalized-subtracted) library is assessed by the degree to which differentially expressed genes are enriched in the library relative to non-differentially expressed genes. As used herein, the “quality” of a normalized library (e.g., a tester-normalized or driver-normalized library) is assessed by the degree to which sequences in the library are present in the same abundance.

The present invention provides methods for conveniently assessing library quality. By comparing the quality of libraries made using starting RNA from the same source but made by using different methods, the superior method can be identified (by virtue of producing a higher quality library).

In one embodiment, the method involves making libraries from the same tester and driver RNA but varying parameters. Detectably labeled probe is made from DNA from each library, using standard methods (e.g., nick translation, Ausubel, supra). The resulting probes are hybridized to an array of immobilized polynucleotides under conditions of specific hybridization.

Suitable polynucleotide arrays may be produced by any of a variety of methods, but typically are spotted onto glass slides or nylon membranes (e.g., Schena et al., 1995, Science 270:467-470, and Zhao et al., 1995, Gene 156:207-213). The array is selected to contain at least some polynucleotide sequences representing genes that are differentially expressed in the tester RNA tissue compared to the driver RNA tissue. This may be accomplished generally in two different ways.

In one method, a reference library (e.g., a tester-subtracted library) is produced from tester and driver RNA (e.g., as described supra). Typically, the tester and driver RNA used for preparation of the reference library is made from the same tissue sources as used for the libraries to be assessed, although it will be appreciated that this is not strictly necessary. The resulting library is cloned (e.g., by ligation to a vector and transformation of bacteria) and DNA corresponding to individual clones is prepared (e.g., by PCR amplification using vector primers). DNA from a plurality of the clones (typically at least 50, more often at least 100, more often at least 1000) is applied to a substrate (e.g., glass slide) for hybridization as described infra. The resulting cDNAs are spotted onto the substrate (e.g., nylon or glass) and the substrate is treated to affix the cDNAs. The array will include differentially expressed sequences (reflecting the library from which the clones were prepared).

A second method for selection of genes can rely on publications for selection of genes previously reported to be expressed in the tester RNA at higher levels than the driver RNA. These can be identified by their GenBank identifier number, and many can be ordered from commercial sources, and these can be amplified by gene specific primers with PCR.

The resulting arrays are then prehybridized, and hybridized with probe described supra. After hybridization (including appropriate washing), the degree of hybridization of each library to various immobilized polynucleotides is detected and compared (e.g., the detectable signal is quantitated). As shown in the Examples, and in FIGS. 2-4, the intensity of hybridization of the labeled probe to an immobilized polynucleotide in the array is indicative of the relative abundance of the probe sequence in the library. For example, the more enriched a library is for a differentially expressed gene, the greater the intensity of the hybridization of probe from that library to the immobilized gene sequence.

According to the invention, a higher quality library is identified because at least one differentially expressed sequence shows higher hybridization signal (compared to a library of lower quality). More often, a higher quality library is characterized by a higher hybridization signal to a plurality of different differentially expressed genes on the array, e.g., at least about 5, 10, 20 or 30 sequences or at least about 5%, 10% or 50% of the genes on the array that are differentially expressed (i.e., show an at least 1.2-fold, preferably an at least 2-fold, often at least 3-fold difference in expression between the tester and driver RNAs). If the differentially expressed sequence is rare (i.e., expressed at a low level relative to the average sequence expression level), the hybridization signal of the rare sequence in the improved subtracted-normalized library will increase relative to a tester-subtracted library. Conversely, if a differentially expressed sequence is abundant (i.e., expressed at a higher level relative to the average sequence expression level), the hybridization signal of the abundant sequence in the improved subtracted-normalized library will decrease relative to a tester-subtracted library. Thus, the method provides for the detection of rare clones that are differentially expressed between two conditions.
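The library comparison described above reduces to counting how many differentially expressed array genes give a higher signal with probe from one library than with probe from another. The following Python sketch shows one way this tally might be made; the function name, the dictionary inputs and the decision to ignore genes not measured with both probes are illustrative assumptions.

```python
def compare_library_quality(signal_a, signal_b, differential_genes):
    """Tally differentially expressed genes with higher signal in library A than B.

    signal_a / signal_b: {gene_id: hybridization signal} for probe made from
    library A and library B against the same array.
    differential_genes: gene ids on the array known to be differentially
    expressed between tester and driver RNA.
    Returns (count, fraction) over the genes measured with both probes.
    """
    shared = [g for g in differential_genes if g in signal_a and g in signal_b]
    if not shared:
        raise ValueError("no differentially expressed genes measured in both libraries")
    higher_in_a = [g for g in shared if signal_a[g] > signal_b[g]]
    return len(higher_in_a), len(higher_in_a) / len(shared)
```

The returned count and fraction can then be compared against the figures given in the text (e.g., at least about 5 sequences, or at least about 5% of the differentially expressed genes on the array).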

VI. Functional Analysis of Identified Genes

Once a gene has been identified as potentially correlated with a particular cellular state or activity, the gene can be subjected to a functional validation process to determine from a functional standpoint whether the gene plays a role in a particular cellular activity or establishment of a cellular state. Such genes are referred to herein as “candidate genes.” Candidate genes can potentially be correlated with a wide variety of cellular states or activities. Examples of such states and activities include, but are not limited to, states related to exposure to certain stimuli (e.g., drugs, toxins, environmental stimuli), disease, age, cellular differentiation and/or stage of development.

In general, the term “functional validation” as used herein refers to a process whereby one determines whether modulation of expression of a candidate gene or set of such genes causes a detectable change in a cellular activity or cellular state for a reference cell, which cell can be a population of cells such as a tissue or an entire organism. The detectable change or alteration that is detected can be any activity carried out by the reference cell. Specific examples of activities or states in which alterations can be detected include, but are not limited to, phenotypic changes (e.g., cell morphology, cell proliferation, cell viability and cell death); cells acquiring resistance to a prior sensitivity or acquiring a sensitivity which previously did not exist; protein/protein interactions; cell movement; intracellular or intercellular signaling; cell/cell interactions; cell activation (e.g., T cell activation, B cell activation, mast cell degranulation); release of cellular components (e.g., hormones, chemokines and the like); and metabolic or catabolic reactions.

In one particular embodiment, candidate genes generally correspond to genes expressed at low levels and/or genes that are differentially expressed with respect to different cells (e.g., diseased cells versus healthy cells). Low level candidate genes are those whose mRNA is about 20% or less of the total mRNA within a cell or a library prepared therefrom, preferably about 15% or less, more preferably about 10% or less, still more preferably about 5% or less, yet still more preferably about 1% or less, and most preferably about 0.1% or less. In some instances, the low abundance genes are 1% or less of the total mRNA in the cell or library prepared therefrom. Genes that are differentially expressed are genes in which there is a detectable difference in expression between the different cells/tissues being compared. Generally, this means that there is at least a 20% change, and in other instances at least a 2-, 3-, 5- or 10-fold difference. The difference usually is one that is statistically significant, meaning that the probability of the difference occurring by chance (the P-value) is less than some predetermined level (e.g., 0.05). Usually the P-value is <0.05, more typically <0.01, and in other instances, <0.001. Both low abundance genes and differentially expressed genes can be identified, for example, according to the methods disclosed supra in section IV.

A variety of options are available for functionally validating candidate genes identified according to the foregoing methods. One particular aspect of the present invention provides a high-throughput functional validation, which generally involves using the transcriptome procedure described herein. In this manner, once the expression of a gene is determined to correlate with a particular cellular state and/or cellular activity, at least a partial clone of the gene is already available from the transcriptome in the form of a plasmid containing a T7/T3 promoter. Alternatively, a promoter can be added to such a partial clone of the gene, e.g., using a PCR approach.

Double-Stranded RNA Interference (RNAi)

As described in the following sections and in further detail in Examples 4 and 5 infra, the current inventors have demonstrated that RNAi technology is an effective approach for functionally validating candidate genes identified through the foregoing gene identification methods. As used herein, RNAi technology refers to a process in which double-stranded RNA is introduced into cells expressing a candidate gene to inhibit expression of the candidate gene, i.e., to “silence” its expression. The dsRNA is selected to have substantial identity with the candidate gene.

The mechanism by which dsRNA exerts its inhibitory effect is not fully understood. However, researchers in the RNAi field currently believe that dsRNA suppresses the expression of endogenous genes by a post-transcriptional mechanism. Specificity in inhibition is important because accumulation of dsRNA in mammalian cells can result in the global blocking of protein synthesis. This blockage appears to result because even low doses of dsRNA (such as occasioned by viral infection, for example) can induce what is called the interferon response. It is believed that in some cases, this response leads to the activation of a dsRNA-responsive protein kinase referred to simply as PKR. Following activation, PKR phosphorylates and inactivates eIF2α, thereby causing global suppression of translation, which in turn triggers cellular apoptosis. However, the present inventors have found that when AGYNB-010 cells are used, there is a minor upregulation of IFN-β, with no significant global suppression of translation, which in turn results in no apoptosis.

The gene identification procedures set forth herein, when coupled with RNAi technology, enable high throughput analysis and validation of a large number of genes for any particular cellular state or activity of interest. In general such methods initially involve transcribing a nucleic acid containing all or part of a candidate gene into single- or double-stranded RNA. Sense and anti-sense RNA strands are allowed to anneal under appropriate conditions to form dsRNA. The resulting dsRNA is introduced into reference cells via various methods and the degree of attenuation in expression of the candidate gene is measured using various techniques. Usually one detects whether inhibition alters a cellular state or cellular activity.

Nature of the dsRNA

The dsRNA is prepared to be substantially identical to at least a segment of a candidate gene. In general, the dsRNA is selected to have at least 70%, 75%, 80%, 85% or 90% sequence identity with the candidate gene over at least a segment of the candidate gene. In other instances, the sequence identity is even higher, such as 95%, 97% or 99%, and in still other instances, there is 100% sequence identity with the candidate gene over at least a segment of the candidate gene. The size of the segment over which there is sequence identity can vary depending upon the size of the candidate gene. In general, however, there is substantial sequence identity over at least 15, 20, 25, 30, 35, 40 or 50 nucleotides. In other instances, there is substantial sequence identity over at least 100, 200, 300, 400, 500 or 1000 nucleotides; in still other instances, there is substantial sequence identity over the entire length of the candidate gene, i.e., the coding and non-coding region of the candidate gene. Suitable regions of the gene include the 5′ untranslated region, the 3′ untranslated region, and the coding sequence.
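To make the notion of "sequence identity over a segment" concrete, the following Python sketch computes the best ungapped identity between any window of the dsRNA sense strand and any equally long window of the candidate gene. This is a deliberately simple stand-in, stated here as an assumption: real alignments would normally be made with a gapped alignment tool, and the function name and default window of 20 nucleotides are chosen only for illustration.

```python
def best_window_identity(dsrna_sense: str, gene: str, window: int = 20) -> float:
    """Best ungapped fractional identity over any window of the given length.

    Both sequences are compared as DNA letters (U is treated as T). Gaps are
    not considered, so this is a lower bound on what a gapped aligner reports.
    """
    rna = dsrna_sense.upper().replace("U", "T")
    dna = gene.upper().replace("U", "T")
    if window <= 0 or len(rna) < window or len(dna) < window:
        raise ValueError("window must be positive and no longer than either sequence")
    best = 0.0
    for i in range(len(rna) - window + 1):
        segment = rna[i:i + window]
        for j in range(len(dna) - window + 1):
            matches = sum(a == b for a, b in zip(segment, dna[j:j + window]))
            best = max(best, matches / window)
    return best
```

A result of 0.95 with `window=20`, for instance, would mean that some 20-nucleotide stretch of the dsRNA matches the candidate gene at 19 of 20 positions.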

Because only sequence similarity between the candidate gene and the dsRNA is necessary, sequence variations between these two species arising from genetic mutations, evolutionary divergence and polymorphisms can be tolerated. Moreover, as described further infra, the dsRNA can include various modified nucleotides or nucleotide analogs.

Usually the dsRNA consists of two separate complementary RNA strands. However, in some instances, the dsRNA may be formed by a single strand of RNA that is self-complementary, such that the strand loops back upon itself to form a hairpin loop. Regardless of form, RNA duplex formation can occur inside or outside of a cell.

The size of the dsRNA that is utilized varies according to the size of the candidate gene whose expression is to be suppressed and is sufficiently long to be effective in reducing expression of the candidate gene in a cell. Generally, the dsRNA is at least 10-15 nucleotides long. In certain applications, the dsRNA is less than 20, 21, 22, 23, 24 or 25 nucleotides in length. In other instances, the dsRNA is at least 50, 100, 150 or 200 nucleotides in length. The dsRNA can be longer still in certain other applications, such as at least 300, 400, 500 or 600 nucleotides. Typically, the dsRNA is not longer than 3000 nucleotides. The optimal size for any particular candidate gene can be determined by one of ordinary skill in the art without undue experimentation by varying the size of the dsRNA in a systematic fashion and determining whether the size selected is effective in interfering with expression of the candidate gene.

Synthesis of dsRNA

dsRNA can be prepared according to any of a number of methods that are known in the art, including in vitro and in vivo methods, as well as by synthetic chemistry approaches.

In vitro methods. Certain methods generally involve inserting the segment corresponding to the candidate gene that is to be transcribed between a promoter or pair of promoters that are oriented to drive transcription of the inserted segment and then utilizing an appropriate RNA polymerase to carry out transcription. One such arrangement involves positioning a DNA fragment corresponding to the candidate gene or segment thereof into a vector such that it is flanked by two opposable polymerase-specific promoters that can be the same or different. Transcription from such promoters produces two complementary RNA strands that can subsequently anneal to form the desired dsRNA. Exemplary plasmids for use in such systems include the plasmid (PCR 4.0 TOPO) (available from Invitrogen). Another example is the vector pGEM-T (Promega, Madison, Wis.) in which the oppositely oriented promoters are T7 and SP6; the T3 promoter can also be utilized.

In a second arrangement, DNA fragments corresponding to the segment of the candidate gene that is to be transcribed are inserted both in the sense and antisense orientation downstream of a single promoter. In this system, the sense and antisense fragments are cotranscribed to generate a single RNA strand that is self-complementary and thus can form dsRNA.

Various other in vitro methods have been described. Examples of such methods include, but are not limited to, the methods described by Sadher et al. (Biochem. Int. 14:1015, 1987); by Bhattacharyya (Nature 343:484, 1990); and by Livache, et al. (U.S. Pat. No. 5,795,715), each of which is incorporated herein by reference in its entirety. Single-stranded RNA can also be produced using a combination of enzymatic and organic synthesis or by total organic synthesis. The use of synthetic chemical methods enables one to introduce desired modified nucleotides or nucleotide analogs into the dsRNA.

In vivo methods. dsRNA can also be prepared in vivo according to a number of established methods (see, e.g., Sambrook, et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed.; Transcription and Translation (B. D. Hames, and S. J. Higgins, Eds., 1984); DNA Cloning, volumes I and II (D. N. Glover, Ed., 1985); and Oligonucleotide Synthesis (M. J. Gait, Ed., 1984), each of which is incorporated herein by reference in its entirety).

Annealing Single-Stranded RNA.

Once the single-stranded RNA has been formed, the complementary strands are allowed to anneal to form duplex RNA. Transcripts are typically treated with DNAase and further purified according to established protocols to remove proteins. Usually such purification methods are not conducted with phenol:chloroform. The resulting purified transcripts are subsequently dissolved in RNAase free water or a buffer of suitable composition.

dsRNA is generated by annealing the sense and anti-sense RNA in vitro. Generally, the strands are initially denatured to keep the strands separate and to avoid self-annealing. Typically, the sense and antisense strands are then combined at certain ratios to facilitate annealing. In some instances, a molar ratio of sense to antisense strands of 3:7 is used; in other instances, a ratio of 4:6 is utilized; and in still other instances, the ratio is 1:1.
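Because the ratios above are molar ratios, it can be convenient to convert them into amounts of each strand. The Python sketch below splits a total quantity of RNA between sense and antisense strands and estimates the corresponding mass. The molecular weight estimate (about 320.5 g/mol per nucleotide plus 159 g/mol) is a commonly used approximation for single-stranded RNA and is an assumption of this example, as are the function name and the choice of equal strand lengths.

```python
def strand_amounts(total_pmol, sense_parts, antisense_parts, length_nt):
    """Split a total amount of RNA between strands at a given molar ratio.

    total_pmol: combined picomoles of sense plus antisense strand.
    sense_parts / antisense_parts: the molar ratio, e.g. 3 and 7 for 3:7.
    length_nt: strand length in nucleotides (both strands assumed equal).
    Returns {"sense": {"pmol", "ng"}, "antisense": {"pmol", "ng"}}.
    """
    total_parts = sense_parts + antisense_parts
    if total_parts <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    # Approximate molecular weight of single-stranded RNA in g/mol (assumption).
    mol_weight = length_nt * 320.5 + 159.0
    amounts = {}
    for name, parts in (("sense", sense_parts), ("antisense", antisense_parts)):
        pmol = total_pmol * parts / total_parts
        # pmol x g/mol gives picograms; divide by 1000 for nanograms.
        amounts[name] = {"pmol": pmol, "ng": pmol * mol_weight / 1000.0}
    return amounts
```

For example, `strand_amounts(100, 3, 7, 500)` divides 100 pmol into 30 pmol of sense strand and 70 pmol of antisense strand.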

The buffer composition utilized during the annealing process can in some instances affect the efficacy of the annealing process and subsequent transfection procedure. While some have indicated that the buffered solution used to carry out the annealing process should include a potassium salt such as potassium chloride (at a concentration of about 80 mM), the current inventors have found that the use of buffered solutions that are substantially potassium free can provide improved results. As used herein the term “substantially potassium free” means that a potassium salt is not added to the buffer solution; as a consequence, the potassium level is generally less than 1 μM, and more typically less than 1 nM. In one aspect of the present invention, it has been found by the current inventors that improved results can be obtained in some instances by using sodium chloride rather than potassium chloride in the annealing buffer solution. The sodium chloride concentration in the annealing buffer solution generally is at least 10 mM, and generally in the range 20 mM to 50 mM. Surprisingly and unexpectedly, the present inventors have also found that further improved results can be obtained using sodium chloride free (i.e., <1 nM of sodium chloride) ammonium acetate at a concentration range of from about 10 μM to about 50 mM.

For example, certain annealing reactions are conducted in a solution containing 20 mM NaCl at 65° C. for 30 minutes, followed by cooling for 15 minutes. Alternatively, the annealing is conducted in a solution containing 10 mM TRIS (pH 7.5) and 20 mM NaCl at 95° C. for 1 minute, after which the solution is allowed to cool at room temperature overnight.

Once single-stranded RNA has annealed to form duplex RNA, typically any single-strand overhangs are removed using an enzyme that specifically cleaves such overhangs (e.g., RNAase A or RNAase T).

Introduction of dsRNA

Cells. Once the dsRNA has been formed, it is introduced into a reference cell, which can include an individual cell or a population of cells (e.g., a tissue, an embryo or an entire organism). The cell can be from essentially any source, including animal, plant, viral, bacterial, fungal and other sources. If a tissue, the tissue can include dividing or nondividing and differentiated or undifferentiated cells. Further, the tissue can include germ line cells and somatic cells. Examples of differentiated cells that can be utilized include, but are not limited to, neurons, glial cells, blood cells, megakaryocytes, lymphocytes, macrophages, neutrophils, eosinophils, basophils, mast cells, leukocytes, granulocytes, keratinocytes, adipocytes, osteoblasts, osteoclasts, hepatocytes, cells of the endocrine or exocrine glands, fibroblasts, myocytes, cardiomyocytes, and endothelial cells. The cell can be an individual cell of an embryo, and can be a blastocyte or an oocyte.

Certain methods are conducted using model systems for particular cellular states (e.g., a disease). For instance, certain methods provided herein are conducted with a neuroblastoma cell line that serves as a model system for investigating genes that are correlated with various neurological diseases. Examples of diseases that can be studied with this particular cell line include, but are not limited to, Alzheimer's disease, Parkinson's disease, brain tumor, epilepsy, stroke, especially ischemic stroke, and other neurodegenerative diseases.

One specific cell line is referred to by the present inventors as the AGYNB-010 cell line. This cell line is prepared as follows. Neuronal cells (ATCC CCL131) are passaged at least 30 times on media containing 0.10 mg/L of Fe(NO₃)₃ and 4500 mg/L of glucose. Cells so prepared have been found to be sensitive to oxygen-glucose deprivation (OGD), N-methyl-D-aspartate (NMDA) and β-amyloid. As such, this particular line of cells serves as a useful model system for studying stroke (e.g., ischemic stroke), Alzheimer's disease and other neurological disorders. Other cell lines can be utilized as model systems to study obesity and brain tumor.

Delivery Options

A number of options can be utilized to deliver the dsRNA into a cell or population of cells such as in a cell culture, tissue or embryo. For instance, RNA can be directly introduced intracellularly. Various physical methods are generally utilized in such instances, such as administration by microinjection (see, e.g., Zernicka-Goetz, et al. (1997) Development 124:1133-1137; and Wianny, et al. (1998) Chromosoma 107:430-439).

Other options for cellular delivery include permeabilizing the cell membrane and electroporation in the presence of the dsRNA, liposome-mediated transfection, or transfection using chemicals such as calcium phosphate. A number of established gene therapy techniques can also be utilized to introduce the dsRNA into a cell. By introducing a viral construct within a viral particle, for instance, one can achieve efficient introduction of an expression construct into the cell and transcription of the RNA encoded by the construct.

If the dsRNA is to be introduced into an organism or tissue, gene gun technology is an option that can be employed. This generally involves immobilizing the dsRNA on a gold particle which is subsequently fired into the desired tissue. Research has also shown that mammalian cells have transport mechanisms for taking in dsRNA (see, e.g., Asher, et al. (1969) Nature 223:715-717). Consequently, another delivery option is to administer the dsRNA extracellularly into a body cavity, interstitial space or into the blood system of the mammal for subsequent uptake by such transport processes. The blood and lymph systems and the cerebrospinal fluid are potential sites for injecting dsRNA. Oral, topical, parenteral, rectal and intraperitoneal administration are also possible modes of administration.

The composition introduced can also include various other agents in addition to the dsRNA. Examples of such agents include, but are not limited to, those that stabilize the dsRNA, enhance cellular uptake and/or increase the extent of interference. Typically, the dsRNA is introduced in a buffer that is compatible with the composition of the cell into which the RNA is introduced to prevent the cell from being shocked. The minimum size of the dsRNA that effectively achieves gene silencing can also influence the choice of delivery system and solution composition.

Quantity of dsRNA Introduced

Sufficient dsRNA is introduced into the tissue to cause a detectable change in expression of the candidate gene (assuming the candidate gene is in fact being expressed in the cell into which the dsRNA is introduced) using available detection methodologies such as those described in the following section. Thus, in some instances, sufficient dsRNA is introduced to achieve at least a 5-10% reduction in candidate gene expression as compared to a cell in which the dsRNA is not introduced. In other instances, inhibition is at least 20, 30, 40 or 50%. In still other instances, the inhibition is at least 60, 70, 80, 90 or 95%. Expression in some instances is essentially completely inhibited to undetectable levels.
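The percent reductions quoted above compare expression in treated cells against cells that did not receive dsRNA. As a minimal sketch, assuming expression has been measured on a linear scale in both samples (the function name and arguments are hypothetical), the calculation is:

```python
def percent_inhibition(control_expression: float, treated_expression: float) -> float:
    """Percent reduction in candidate gene expression relative to untreated cells.

    control_expression: expression level in cells without dsRNA (must be > 0).
    treated_expression: expression level in cells that received dsRNA.
    A negative result indicates that expression increased.
    """
    if control_expression <= 0:
        raise ValueError("control expression must be positive")
    return 100.0 * (control_expression - treated_expression) / control_expression
```

A control level of 100 and a treated level of 40, in arbitrary units, correspond to 60% inhibition.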

The amount of dsRNA introduced depends upon various factors such as the mode of administration utilized, the size of the dsRNA, the number of cells into which dsRNA is administered, and the age and size of an animal if dsRNA is introduced into an animal. An appropriate amount can be determined by those of ordinary skill in the art by initially administering dsRNA at several different concentrations, for example. In certain instances when dsRNA is introduced into a cell culture, the amount of dsRNA introduced into the cells varies from about 0.5 to 3 μg per 10⁶ cells.
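The cell culture range of about 0.5 to 3 μg per 10⁶ cells scales linearly with cell number. The following short Python sketch performs that scaling; the function name and the use of the range endpoints as defaults are illustrative assumptions.

```python
def dsrna_range_ug(cell_count: float, low_per_million: float = 0.5,
                   high_per_million: float = 3.0):
    """Return (low, high) micrograms of dsRNA for the given number of cells.

    The defaults follow the range of about 0.5 to 3 micrograms per 10^6 cells
    given in the text for cell culture.
    """
    if cell_count < 0:
        raise ValueError("cell count cannot be negative")
    millions = cell_count / 1_000_000
    return low_per_million * millions, high_per_million * millions
```

A culture of 2 × 10⁶ cells would thus receive between about 1 and 6 μg of dsRNA.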

Detecting Interference of Expression

A number of options are available to detect interference of candidate gene expression (i.e., to detect candidate gene silencing). In general, inhibition in expression is detected by detecting a decrease in the level of the protein encoded by the candidate gene, determining the level of mRNA transcribed from the gene and/or detecting a change in phenotype associated with candidate gene expression.

Various methods can be utilized to detect changes in protein levels. Exemplary methods include, but are not limited to, Western blot analysis, performing immunological analyses utilizing an antibody that specifically binds to the protein followed by detection of complex formed between the antibody and protein, and activity assays, provided the protein has a detectable activity. Similarly, a number of methods are available for detecting attenuation of candidate gene mRNA levels. Such methods include, for example, dot blot analysis, in-situ hybridization, RT-PCR, quantitative reverse-transcription PCR (i.e., the so-called “TaqMan” methods), Northern blots and nucleic acid probe array methods.

The phenotype of the cell can also be observed to detect a phenotypical change that is correlated with inhibition of expression of the candidate gene. Such phenotypical changes can include, for instance, apoptosis, morphological changes and changes in cell proliferation as well as other cellular activities listed supra. Thus, for example, using the neuroblastoma cell line discussed above which serves as a model system for neurological disease studies, one can detect what effect, if any, interference of expression of the candidate gene has on the sensitivity to OGD, β-amyloid and NMDA, for example. If interference with expression of a particular gene relieves one or more of these sensitivities, then therapeutic methods can be developed which involve blocking expression of such a gene. And if interference with expression of a particular gene increases one or more of these sensitivities, then therapeutic methods can be developed which involve activating expression of such a gene.

Alternative Functional Validation Protocols

Methods which combine the library preparation and RNAi techniques described above enable a large number of candidate genes to be analyzed in a high throughput format to determine if the genes play a role in a particular biological state or activity. However, the library preparation methods provided herein can successfully be used in combination with other functional validation approaches, as well. Examples of such approaches follow.

Antisense

Antisense technology can be utilized to functionally validate a candidate gene. In this approach, an antisense polynucleotide that specifically hybridizes to a segment of the coding sequence for the candidate gene is administered to inhibit expression of the candidate gene in those cells into which it is introduced. Methods relating to antisense polynucleotides are well known, see e.g., Melton, D., Ed, 1988, ANTISENSE RNA AND DNA, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; Dagle et al., 1991, Nucleic Acids Research, 19:1805; and Uhlmann et al., Chem. Reviews, 90:543-584 (1990).

In general, the antisense polynucleotide should be long enough to form a stable duplex but short enough, depending on the mode of delivery, to be administered in vivo, if desired. The minimum length of a polynucleotide required for specific hybridization to a target sequence depends on several factors, such as G/C content, positioning of mismatched bases (if any), degree of uniqueness of the sequence as compared to the population of target polynucleotides, and chemical nature of the polynucleotide (e.g., methylphosphonate backbone, peptide nucleic acid, phosphorothioate), among other factors. Typically, the antisense polynucleotides used in the functional validation methods comprise an antisense sequence that usually is at least about 10 contiguous nucleotides long, in other instances at least 12 or 14 contiguous nucleotides long, and in still other instances up to about 100 contiguous nucleotides long, which sequence specifically hybridizes to a sequence from a mRNA encoding the candidate gene.
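As an illustration of two of the sequence properties discussed above (length and G/C content), the following Python sketch derives candidate antisense oligonucleotides from an mRNA segment by reverse complementation and reports the G/C fraction of each window. The function names, the 14-nucleotide default window and the choice of a DNA output alphabet are illustrative assumptions; secondary structure, mismatches and sequence uniqueness are not evaluated here.

```python
from typing import Iterator, Tuple

# Watson-Crick partners, written so the antisense strand is emitted as DNA.
_PARTNER = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_dna(mrna_segment: str) -> str:
    """Reverse complement of an mRNA (or cDNA) segment, as a DNA oligo."""
    return "".join(_PARTNER[base] for base in reversed(mrna_segment.upper()))


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def candidate_antisense(
    mrna: str, length: int = 14
) -> Iterator[Tuple[int, str, float]]:
    """Yield (start, antisense oligo, G/C fraction) for each window of mRNA."""
    for start in range(len(mrna) - length + 1):
        oligo = antisense_dna(mrna[start:start + length])
        yield start, oligo, gc_fraction(oligo)
```

Candidates produced this way would still need to be ranked by accessibility of the target region and tested in vitro or in vivo.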

In some instances, the antisense sequence is complementary to relatively accessible sequences of the candidate gene mRNA (e.g., relatively devoid of secondary structure). This can be determined by analyzing predicted RNA secondary structures using, for example, the MFOLD program (Genetics Computer Group, Madison Wis.) and testing in vitro or in vivo as is known in the art. Another useful method for optimizing antisense compositions uses combinatorial arrays of oligonucleotides (see, e.g., Milner et al., 1997, Nature Biotechnology 15:537). The antisense nucleic acids (DNA, RNA, modified, analogues, and the like) can be made using any suitable method for producing a nucleic acid, such as chemical synthesis and recombinant methods that are well known in the art.

Gene Knockout Approaches

The functional role that a candidate gene plays in a cell can also be assessed using gene “knockout” approaches in which the candidate gene is deleted, modified, or inhibited on either a single or both alleles. The cells or animals can optionally be reconstituted with a wild-type candidate gene as part of a further analysis.

Certain “knockout” approaches are based on the premise that the level of expression of a candidate gene in a mammalian cell can be decreased or completely abrogated by introducing into the genome a new DNA sequence that serves to interrupt some portion of the DNA sequence of the candidate gene. To prevent expression of functional protein, simple mutations that either alter the reading frame or disrupt the promoter can be suitable. A “gene trap insertion” can be used to disrupt a candidate gene, and embryonic stem (ES) cells (e.g., from mice) can be used to produce knockout transgenic animals (see, e.g., Holzschu (1997) Transgenic Res 6: 97-106).

The insertion of the exogenous sequence is typically by homologous recombination between complementary nucleic acid sequences. Thus, the exogenous sequence is some portion of the candidate gene which one seeks to modify, such as exonic, intronic or transcriptional regulatory sequences, or any genomic sequence which is able to affect the level of expression of the candidate gene; or a combination thereof. The construct can also be introduced into other (i.e., non-candidate gene) locations in the genome. Gene targeting via homologous recombination in pluripotential embryonic stem cells allows one to modify precisely the candidate gene of interest.

The exogenous sequence is typically inserted in a construct, usually also with a marker gene to aid in the detection of the knockout construct and/or a selection gene. The construct can be any of a variety of expression vectors, plasmids, and the like. The knockout construct is inserted in a cell, typically an embryonic stem (ES) cell, using a variety of established techniques. As noted above, the insertion of the exogenous DNA usually occurs by homologous recombination. The resultant transformed cell can be a single gene knockout (i.e., only one of the two copies of the candidate gene has been modified) or a double gene knockout (i.e., both copies of the candidate gene have been modified).

Typically less than one to five percent of the ES cells that take up the knockout construct actually integrate exogenous DNA in these regions of complementarity; thus, identification and selection of cells with the desired phenotype is usually necessary. This can be accomplished by detecting expression of the selection or marker sequence described above. Cells that have incorporated the construct are selected for prior to inserting the genetically manipulated cell into a developing embryo. A variety of selection and marker techniques are well known in the art (e.g., antibiotic resistance selection or beta-galactosidase marker expression). Alternatively, insertion of the exogenous sequence and levels of expression of the endogenous candidate gene or marker/selection genes can be detected by hybridization or amplification techniques or by antibody-based assays.

After selection of manipulated cells with the desired phenotype (i.e., complete or partial inability to express the candidate gene), the cells are inserted into an embryo (e.g., a mouse embryo). Insertion can be accomplished by a variety of techniques, such as microinjection, in which about 10 to 30 cells are collected into a micropipet and injected into embryos that are at the proper stage of development to integrate the ES cell into the developing embryonic blastocyst, at about the eight cell stage (for mice, this is about 3.5 days after fertilization). The embryos are obtained by perfusing the uterus of pregnant females. After the ES cell has been introduced into the embryo, it is implanted into the uterus of a pseudopregnant foster mother, which is typically prepared by mating with vasectomized males of the same species. In mice, the optimal time to implant is about two to three days pseudopregnant. Offspring are screened for integration of the candidate gene. Offspring that have the desired phenotype are crossed to each other to generate a homozygous knockout. If it is unclear whether germline cells of the offspring have a modified candidate gene, they can be crossed with a parental or other strain and the offspring screened for heterozygosity of the desired trait.

Further guidance regarding preparation of mice that have a knocked out candidate gene is provided in the following sources, for example: Bijvoet (1998) Hum. Mol. Genet. 7:53-62; Moreadith (1997) J. Mol. Med. 75:208-216; Tojo (1995) Cytotechnology 19:161-165; Mudgett (1995) Methods Mol. Biol. 48:167-184; Longo (1997) Transgenic Res. 6:321-328; U.S. Pat. No. 5,616,491 (Mak, et al.); U.S. Pat. Nos. 5,464,764; 5,631,153; 5,487,992; 5,627,059; 5,272,071; and, WO 91/09955, WO 93/09222, WO 96/29411, WO 95/31560, and WO 91/12650.

Ribozymes

Ribozymes can also be utilized to inhibit candidate gene expression in a cell or animal. Useful ribozymes can comprise 5′- and 3′-terminal sequences complementary to the candidate gene and can be engineered by one of skill on the basis of the sequence of the candidate gene. Various types of ribozymes can be utilized in the functional validation studies, including, for example, those that have characteristics of group I intron ribozymes (see, e.g., Cech, 1995, Biotechnology 13:323) and those that have the characteristics of hammerhead ribozymes (see, e.g., Edgington, 1992, Biotechnology 10:256).

Ribozymes and antisense polynucleotides can be delivered by a number of techniques known in the art, including liposomes, immunoliposomes, ballistics, direct uptake into cells, and the like (see, e.g., U.S. Pat. No. 5,272,065).

Co-Immunoprecipitation

Co-immunoprecipitations can be used to functionally validate the role of a protein in a pathway. If two proteins interact and antibodies are available, co-immunoprecipitations can be used to quickly confirm their role in a pathway.

Alternative Methods for Identifying Candidate Genes

While the functional validation methods (e.g., RNAi methods) disclosed herein have been discussed primarily with respect to candidate genes identified from subtractive and/or normalized libraries prepared according to the methods described supra, it should be understood that these functional validation procedures can be utilized to functionally validate genes that have been identified by any of a number of other methods. For example, the functional validation procedures (e.g., RNAi methods) provided herein can be used to functionally validate low abundance genes and differentially expressed genes identified using other techniques.

These techniques include, but are not limited to, (i) differential display PCR (see, e.g., U.S. Pat. Nos. 5,262,311; 5,599,672; and Liang, P. and Pardee, A. B., (1992) Science 257:967-971); (ii) nucleic acid probe arrays (see, e.g., WO 97/10365; WO 97/27317; and the entire supplement of Nature Genetics, vol. 21 (1999)); (iii) Quantitative RT-PCR (see, e.g., U.S. Pat. Nos. 5,210,015; 5,538,848; and 5,863,736); (iv) dot blot analysis; (v) in situ hybridization (see, e.g., Harris, D. W. (1996) Anal. Biochem. 243:249-256; and Singer, et al. (1986) Biotechniques 4:230-250); (vi) differential screening methods (see, e.g., Tedder, T. F., et al. (1988) Proc. Natl. Acad. Sci. USA 85:208-212); and (vii) other subtractive hybridization methods such as those listed above (see, also, Sargent, T. D. (1987) Methods of Enzymol. 152:423-432; and Lee, et al. (1991) Proc. Natl. Acad. Sci. USA 88:2825-2830).

The following examples are provided solely to illustrate certain aspects of the methods that are disclosed herein and are not to be interpreted so as to limit the scope of the application in any way.

EXAMPLE 1 Use of “Knock-Down” Method

A microglia cell line was stimulated with lipopolysaccharide (LPS, 100 ng/ml) and γ-interferon (IFN-γ, 100 U/ml) in a culture dish. Stimulated and unstimulated cells were harvested at 12 hours and a tester-subtracted library prepared (SL18). In this specific case, the tester and driver dscDNAs were digested with Rsa I, and adaptor set 1 (see Table IV, supra) was used for tester ligations. The first and second hybridizations were for 8 and 16 h, respectively. PCR amplification (primary PCR: 25 cycles, secondary PCR: 12 cycles) was with primer set 1, and products were cloned in pCR 2.1. Primer set 1 is shown supra in Table IV.

To identify sequences useful in the knock-down protocol, randomly chosen clones were submitted for DNA sequencing and sequence results were analyzed using the BlastN algorithm. Of 134 sequences identified by BlastN there were a number of genes represented more than once. Four unique genes were represented multiple times by 5, 5, 5, and 6 redundant clones, respectively, accounting for more than 15% of the BlastN identified sequences. “Knock-down” hybridization matrix analysis proceeded using these genes as “knock-down” polynucleotides. Another 6,000 colonies from the library were picked, and amplified inserts were arrayed on nylon membranes in triplicate. Membranes were each hybridized to ³²P-labeled tester and driver cDNAs under stringent conditions, signal intensities analyzed by phosphorimaging and ratios of signal intensities calculated. “Knock-down” of labeled tester cDNA hybridization signal intensity was accomplished by inclusion of unlabeled “knock-down” polynucleotides during probe denaturation prior to hybridization. As shown in FIG. 1, inclusion of the knock-down polynucleotides resulted in a reduction in signal for redundant clones. In this library, “knock-down” analysis identified 610 clones as redundant, and further analysis (e.g., sequencing) of these genes was thus avoided.

Clones showing at least a 2-fold difference in signal intensities between tester and driver were selected for DNA sequencing and further analysis. Out of the 6,000 original clones in the library, for SL18 a total of 384 differentially regulated clones were identified. The results of sequence analysis of these clones, up-regulated by LPS/IFN-γ, are shown in Table XI:

TABLE XI*

Library    Known Genes    Similar Genes    Unknown Genes
SL18       52%            22%              26%

*Gene classification is based on BlastN results using the most recent version of Genbank as database. Genes are considered to be “known” if they display a high degree of similarity (>80% identity on nucleotide level) to a database entry, as similar if they display a distant similarity (40-80% identity on nucleotide level) and as unknown if they do not show any homology or an insignificant homology to a database entry.

The identification, in this experiment, of redundant clones demonstrates the utility of this method for efficient high-throughput analysis of a large number of genes. In addition, the large number of unknown genes identified is a further validation for the completeness of the analysis.

EXAMPLE 2 Knockdown Selection of Redundant Clones

A mouse microglial cell line known to respond to stimulation by incubation in media containing lipopolysaccharide (LPS) and gamma interferon (γIFN) was used. mRNA was purified from cells before (=driver) and after stimulation (=tester). A normalized and subtracted cDNA library was prepared and cloned in bacteria (“Library 1”).

For a representative number of clones (670), sufficient sequence was determined to assign a Genbank identifier tag (GID) based on a BLAST comparison. Clones matching a GID for MERANTES (GID X70675) were highly represented in the sample (10 clones of 670, or approximately 1.5%). DNA corresponding to the MERANTES sequence was amplified by PCR to produce “knockdown cDNA.”

Radiolabeled cDNA probes were prepared from approximately 0.5 micrograms of tester or driver mRNA. The knockdown cDNA was boiled 5 minutes, cooled on ice, and approximately 1 microgram was added to aliquots of radiolabeled tester probe. Equivalent aliquots of radiolabeled tester probe and driver probe were used without the addition of knockdown cDNA. The probe or probe/knockdown mixtures were incubated at 68° C. for 20 minutes and hybridization solution (50% formamide, 5×SSC, 5× Denhardt's reagent, 1% SDS, 0.025% sodium pyrophosphate) was added.

Each of the probe mixtures was hybridized to nylon membranes onto which PCR-amplified cDNA prepared from the 670 partially sequenced clones from Library 1 had been spotted. Hybridization was for 20 hours at 42° C. and was followed by washing and signal detection.

Quantitation of the signal level of tester, knockdown-tester and driver hybridizations allowed the selection of clones upregulated by LPS and γIFN, based on their tester/driver ratios. Further, the signal ratio of tester/knockdown-tester allowed for the identification of clones that match the knockdown cDNA. All 10 clones corresponding to MERANTES were identified by an elevated tester/knockdown-tester ratio, with an average tester/knockdown-tester signal ratio of 6.4 fold (stdev 2.2). In contrast, the average tester/knockdown-tester signal ratio for all clones was 1.38 (stdev 0.7). There was one clone with a tester/knockdown-tester ratio above 3 fold that was not MERANTES. The effort of selecting and further handling redundant clones (e.g., MERANTES) can be reduced by rejection of clones having an elevated tester/knockdown-tester ratio (e.g., greater than 3).
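The two ratio tests described in this example can be sketched in Python as follows. The class and function names are hypothetical. The cutoff of 3 for the tester/knockdown-tester ratio is taken from the text above, whereas the 2-fold tester/driver cutoff is borrowed from Example 1 and is an assumption here, since this example does not state one. Signals are assumed to be positive, background-corrected values.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class CloneSignals:
    clone_id: str
    tester: float            # signal with tester probe alone
    knockdown_tester: float  # signal with tester probe plus knockdown cDNA
    driver: float            # signal with driver probe


def is_upregulated(clone: CloneSignals, min_ratio: float = 2.0) -> bool:
    """Tester/driver ratio at or above the (assumed) 2-fold cutoff."""
    return clone.tester / clone.driver >= min_ratio


def is_redundant(clone: CloneSignals, max_ratio: float = 3.0) -> bool:
    """Tester/knockdown-tester ratio above 3 flags a match to the knockdown cDNA."""
    return clone.tester / clone.knockdown_tester > max_ratio


def select_for_sequencing(clones: Iterable[CloneSignals]) -> List[str]:
    """Keep upregulated clones and reject those flagged as redundant."""
    return [c.clone_id for c in clones if is_upregulated(c) and not is_redundant(c)]
```

As the text notes, one non-MERANTES clone exceeded the 3-fold ratio, so a filter of this kind can occasionally reject a clone that is not actually redundant.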

EXAMPLE 3 Improved Method for Evaluating Quality of Normalized and Subtracted cDNA Libraries

A. Preparation of Tester and Driver ds cDNA

Human fibroblasts (ATCC CRL 2091) were grown to approximately 60% confluence in 15 cm Petri dishes in Dulbecco's Modified Eagle Medium (DMEM), 10% Fetal Calf Serum (FCS). The cells were washed 3 times with DMEM lacking FCS. After a 48 hour incubation in DMEM with 0.1% FCS the medium was replaced with fresh medium containing 10% FCS (serum stimulation). Cells were collected at two different time points. One batch of cells was collected just prior to serum stimulation (serum starved cells). This sample served as a time zero reference from which “driver” RNA was prepared. Another batch was collected 6 hours after the addition of FCS (serum stimulated cells). This sample served as a stimulated sample from which “tester” RNA was prepared.

Total RNA from these samples was prepared using Trizol (Life Technologies). mRNA was selected using Oligotex Kit (Qiagen). The poly A⁺ RNA was reverse transcribed using an Oligo dT priming method and converted into double-stranded cDNA (dscDNA) using standard methods.

B. Preparation of Normalized and Subtracted Libraries

The ds cDNA was digested with Rsa I (NEB). The Rsa I-digested tester and driver ds cDNA were divided into two aliquots each, and each aliquot was ligated to an adapter oligonucleotide (Adapter set No. 1, shown in Table IV, supra). The ligation reaction was performed for 12 hours at 16° C. using T4 DNA Ligase (2000 U/μl).

Normalized-subtracted and normalized libraries were prepared as described in § D and E, supra, respectively, using different tester/driver ratios and different conditions for the two annealing steps, as summarized in the table below.

Library ID   Library Description                         Ratio Tester/driver   Annealing time (first annealing step)   Annealing time (second annealing step)
A            Driver-Normalized                           1:5                   9 hours                                 18 hours
B            Tester-Normalized                           1:5                   9 hours                                 18 hours
C            Normalized-Subtracted, Tester-Subtracted    1:5                   9 hours                                 18 hours
D            Normalized-Subtracted, Tester-Subtracted    1:15                  9 hours                                 18 hours
E            Normalized-Subtracted, Tester-Subtracted    1:10                  9 hours                                 18 hours
F            Normalized-Subtracted, Tester-Subtracted    1:10                  12 hours                                18 hours
G            Normalized-Subtracted, Tester-Subtracted    1:10                  12 hours                                36 hours
H            Normalized-Subtracted, Driver-Subtracted    1:20                  9 hours                                 18 hours
I            Normalized-Subtracted, Driver-Subtracted    1:10                  9 hours                                 18 hours
J            Normalized-Subtracted, Driver-Subtracted    1:10                  12 hours                                18 hours
K            Normalized-Subtracted, Driver-Subtracted    1:10                  12 hours                                36 hours

Following annealing, a 2-step (nested) PCR amplification was performed to isolate sequences of interest. In the first PCR reaction only molecules which have different adapter sequences on each end are amplified exponentially by the adapter-specific primer PCR1. The number of PCR cycles needed to obtain sufficient amounts of amplicon for analysis depends on the experimental paradigm under investigation, and needs to be determined empirically by performing the PCR amplification procedure with different cycle numbers and analyzing amplicon yields (e.g., by agarose gel electrophoresis). In this analysis, different numbers of PCR cycles (21, 23, 25 and 27) were used for the first PCR amplification whereas the second, nested PCR amplification using nPCR1 and nPCR2 as primers proceeded with 12 cycles for all samples.

PCR primer for first amplification:
PCR1, CTAATACGACTCACTATAGGGC (SEQ ID NO:14)

PCR primer pair for second, nested amplification:
nPCR1, TCGAGCGGCCGCCCGGGCAGGT (SEQ ID NO:15)
nPCR2, AGCGTGGTCGCGGCCGAGGT (SEQ ID NO:16)

C. Evaluation of Library Quality

i) Array Preparation

Arrays can be prepared using various materials and protocols (for examples, see Schena, Mark et al., “Quantitative monitoring of Gene Expression patterns with a complementary DNA microarray”, Science (1995) v270:467-470, and Zhao, Nanding et al., “High-Density cDNA Filter Analysis: A Novel Approach for Large-Scale, Quantitative Analysis of Gene Expression”, Gene (1995) v156:207-213). An array can be comprised of a large number of clonal cDNAs on a substrate. The cDNAs can be produced by various methods, including purification of plasmids and PCR amplification. The cDNAs are commonly attached by treatment with heat, ultraviolet light, chemicals or enzymes, or by reaction with a preactivated surface. One typical array starts with the PCR amplification of 11520 bacterial clones containing cDNAs inserted into a plasmid. These clones are commonly from a normalized-subtracted library and therefore contain genes differential in tester and driver mRNA expression levels. Aliquots of the PCR reactions are spotted onto nylon membrane (Schleicher & Schuell) to produce the array. To this array various standard genes are added. The cDNA fragments are denatured by wetting the membrane in a solution of 0.5M sodium hydroxide, 1.5M sodium chloride to allow better availability for hybridization, then neutralized and crosslinked by ultraviolet light (Stratalinker, Stratagene).

A particular example of a cDNA array suitable for analysis of library production methods was prepared. Clones corresponding to 80 genes were selected because their mRNA expression levels in fibroblasts varied upon stimulation by serum, based on cDNA microarray data as described in Iyer, Vishwanath et al., 1999 Science v283:83-87, incorporated herein by reference in its entirety for all purposes. Recombinant clones were purchased from Research Genetics and verified by DNA sequencing. The cDNA insert of each clone was PCR-amplified using vector-specific primers. PCR products were verified by gel electrophoresis. PCR products were spotted in sextuplicate on nylon membranes.

ii) Probe Preparation

ds cDNA from each of libraries A-K described supra (i.e., the products of the second PCR amplification) were gel purified using a QiaEx Gel purification kit. The purified products were labeled with ³²P-dCTP (Klenow, Decamer labeling Kit, Ambion) and unincorporated nucleotides were removed by spin column P30 (BioRad).

iii) Evaluation of Library Quality

The probes were hybridized to the cDNA arrays at 42° C. in 5×SSC/50% formamide for 20 hours. The hybridized arrays were washed in 0.1×SSC at 60° C. and exposed to phosphorimager screens (Packard Instruments) for approximately 64 h. Hybridization signal intensities were determined by a Cyclone scanner and Optiquant software (Packard Instruments), normalized by controls including genomic DNA standards, and comparisons were made between serum-starved fibroblasts (=driver), serum-stimulated fibroblasts (=tester) and different normalized and subtracted libraries. Signal intensity of filter hybridizations was used to determine the abundance of genes and gene fragments in the material used to make the probe (see NUCLEIC ACID HYBRIDIZATION, A PRACTICAL APPROACH, pp. 21-22 and 77-111, Hames BD and Higgins SJ eds., IRL Press (1985), and Kafatos et al., 1979, Nucleic Acids Res., 7, 1541).

Analysis of the quantified hybridization signal from the arrays allowed grouping of the arrayed genes into several classes based on signal intensities after hybridization. These classes were called low, medium, or high signal levels (herein, corresponding to clones with approximate signal levels of less than 5000 Digital Light Units or DLU=low, 5000-16000 DLU=medium, greater than 16000 DLU=high, corresponding to the intensity of the original radioactive probe hybridized to each spot of cloned cDNA on the array). The arrayed genes were also grouped into classes that increase, maintain, or decrease signal intensity (i.e., were regulated in the amount of mRNA produced under the conditions of tester and driver, e.g., serum-stimulation and serum-starvation). In this example, genes were considered up-regulated if the ratio of their tester/driver signals was greater than 2, genes were considered unchanged if the ratio of their tester/driver signals was greater than 0.85 and less than 1.15, and genes were considered down-regulated if the ratio of their driver/tester signals was greater than 1:5. For example, a gene could be of low abundance in driver (i.e., low signal of hybridization, herein less than 5000 DLU) and upregulated (i.e., ratio of tester/driver signals is greater than 2).
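The grouping rules above can be expressed compactly, as in the following Python sketch. The function names and the "unclassified" label are illustrative assumptions. The abundance boundaries (5000 and 16000 DLU) and the up-regulated and unchanged ratio windows are taken from the text; the down-regulation cutoff is passed in by the caller rather than hard-coded, and signals are assumed to be positive.

```python
def abundance_class(signal_dlu: float) -> str:
    """Classify a hybridization signal (in DLU) as low, medium or high."""
    if signal_dlu < 5000:
        return "low"
    if signal_dlu <= 16000:
        return "medium"
    return "high"


def regulation_class(tester: float, driver: float, down_cutoff: float) -> str:
    """Classify a clone by its tester and driver signals.

    down_cutoff is the driver/tester ratio above which a clone is called
    down-regulated; it is supplied by the caller (an assumption of this sketch).
    """
    ratio = tester / driver
    if ratio > 2.0:
        return "up"
    if 0.85 < ratio < 1.15:
        return "unchanged"
    if driver / tester > down_cutoff:
        return "down"
    return "unclassified"  # falls between the defined windows
```

A clone with a driver signal of 3000 DLU and a tester signal of 9000 DLU would thus be reported as low abundance in driver and up-regulated.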

In FIGS. 2-4, selected clones within the different abundance classes illustrate the effect of condition group (Library ID) and PCR cycle number (e.g., 21, 23, 25, or 27 cycles) on the representation of the clone in the library. For reference, hybridization values for control (=driver) probe, marked Rsal, 0 h, and serum stimulated (=tester) probe, marked Rsal, 6 h, are included in each graph.

This analysis allowed the determination of enrichment factors for each clone represented on the cDNA array and each normalized and subtracted cDNA library. The enrichment factors describe the change in abundance of a particular gene in normalized and subtracted cDNA libraries and are indicators for the success/quality of that library. The quality of a normalized-subtracted library is assessed by the degree to which differentially expressed genes are enriched in the library. During Tester-Subtracted subtraction, upregulated genes (of abundance higher in tester than in driver) are increased in abundance in the resulting library, and down regulated genes are decreased. During reverse subtraction, the reverse is true (e.g., down regulated genes are increased in abundance in the resulting library). The data show that particular conditions (e.g., F25) can increase further the signal and abundance of low, medium and high abundance genes where their initial abundance is higher in tester than in driver.
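The text does not give a formula for the enrichment factor. One plausible reading, offered here only as an assumption, is the ratio of a clone's hybridization signal with a library probe to its signal with the unsubtracted reference probe (tester for a tester-subtracted library). The Python sketch below uses that reading; the function names are hypothetical, and condition labels such as "F25" follow the library ID plus cycle number convention used above.

```python
from typing import Dict, Tuple


def enrichment_factor(library_signal: float, reference_signal: float) -> float:
    """Assumed definition: library probe signal divided by reference probe signal."""
    if reference_signal <= 0:
        raise ValueError("reference_signal must be positive")
    return library_signal / reference_signal


def best_condition(
    signals_by_condition: Dict[str, float], reference_signal: float
) -> Tuple[str, float]:
    """Return the condition label with the largest enrichment factor for one clone."""
    factors = {
        condition: enrichment_factor(signal, reference_signal)
        for condition, signal in signals_by_condition.items()
    }
    best = max(factors, key=factors.get)
    return best, factors[best]
```

Comparing such factors across clones of known regulation gives one way to rank library conditions, though other definitions of enrichment are equally possible.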

The quality of a tester-normalized or driver-normalized library is assessed by the degree to which sequences in the library are present in the same abundance, as assessed by a similar intensity of hybridization to the arrayed clones. In a perfectly normalized library, all of the sequences represented are present in the same abundance. Normalization of the abundance of clones gives a more equal chance of discovering what were initially abundant and non-abundant genes, saving time by reducing redundancy of the clone fragments. The data show that particular conditions (e.g., library B) can increase further the signal and abundance of low, medium and high abundance genes where their initial abundance is higher in tester than in driver.
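The degree to which signals are "present in the same abundance" can be summarized by a single dispersion statistic. The Python sketch below uses the coefficient of variation (population standard deviation divided by the mean) of the arrayed clone signals; this choice of statistic and the function name are assumptions made for illustration, not a measure stated in the text. A perfectly normalized library would give a value of zero.

```python
from statistics import mean, pstdev
from typing import Sequence


def coefficient_of_variation(signals: Sequence[float]) -> float:
    """Population standard deviation of the signals divided by their mean."""
    if not signals:
        raise ValueError("signals must not be empty")
    average = mean(signals)
    if average == 0:
        raise ValueError("mean signal must be non-zero")
    return pstdev(signals) / average
```

Under this assumption, a lower value for a normalized library than for the starting tester or driver probe would indicate more even representation.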

The quality of a tester-subtracted normalized library is demonstrated by an increase in the occurrence of genes that are more abundant in tester than in driver, a decrease in the occurrence of genes that are more abundant in driver than tester, and normalization of the abundance of the genes that remain in the library. This leads to an increase in the abundance of genes having a low abundance that are more prevalent in tester than driver. The normalization will also decrease the redundancy of very abundant genes that are more prevalent in tester than driver. This effect of normalization will ease the discovery of genes more specific to tester that are rare, and increase the efficiency of identifying all genes in the subtracted library. An equivalent assessment of quality can be made for a driver-subtracted normalized library.

EXAMPLE 4 Double-Stranded RNA Transfection Blocks eGFP Expression in Neuroblastoma Derived Cells

This experiment was undertaken to determine the level of gene specific silencing that could be achieved in certain neuroblastoma cell lines using RNAi techniques as described herein. The AGYNB-010 cell line utilized in this particular investigation was derived from a neuroblastoma cell line called Neuro 2A (ATCC No. CCL131). As described further below, the AGYNB-010 cell line was shown by the current inventors to be sensitive to OGD, NMDA and β-amyloid relative to the Neuro 2A cell line. The sensitivities exhibited by the AGYNB-010 cell line make the cell line a good model system for studying various neurological and non-neurological conditions such as ischemia, excitotoxicity, Alzheimer's disease and oxidative stress because these conditions are associated with the foregoing sensitivities. The AGYNB-010 cell line was transfected with a green fluorescent protein (GFP) expressing plasmid to provide an assay system to determine the reduction in specific protein levels achieved by RNAi rapidly and quantitatively.

Materials and Methods

Generation of a neuroblastoma derived cell line expressing the enhanced Green Fluorescent Protein (eGFP). Neuro 2A cells were grown in DMEM and then plated in a six well plate at a concentration of 5×10⁵ cells/ml. A plasmid expressing eGFP was obtained from Clontech (pEGFP-C1). Twenty-four hours after seeding the plates with Neuro 2A cells, the cells were co-transfected with 0.5 microgram of pCMVneo (available from Stratagene) and three micrograms of pEGFP-C1. Forty-eight hours after cotransfection, cells were transferred to media containing G418 to select for transfected cells. Cells resistant to G418 were selected, tested for GFP by visualization with a light microscope, replated and independent clonal lines established. The established cell line was further tested for OGD, β-amyloid, and NMDA sensitivity according to the assays set forth below in this section.

High throughput RNA transcription. Single strands of sense and anti-sense RNA corresponding to about 500 bp of EGFP-C (i.e., about 500 bp of the C-terminus of the pEGFP) were transcribed in vitro from the full length pEGFP clones using T3 and T7 promoters. Addition of SP6 polymerase results in the transcription of sense RNA, and addition of T7 polymerase results in the transcription of antisense RNA (Ambion). Transcripts were purified of proteins using phenol-chloroform extraction. RNA was precipitated by adding 20 microliters of 10 M ammonium acetate and 220 microliters of isopropanol to 200 microliters of the extracted mix and then incubating the resulting mixture at −20° C. for 15 minutes. The mixture was centrifuged and the RNA pellet dried and resuspended in 100 microliters of RNAse free double distilled water. The concentration of RNA was determined to be approximately 1 microgram/ml. The length of the transcripts was typically 500 bases or more.
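The final composition of the precipitation mixture follows from simple dilution arithmetic (stock concentration × stock volume ÷ total volume), sketched below in Python. The function names are hypothetical; the default volumes and the 10 M stock are those stated above, and the sketch assumes the 200 microliter sample itself contributes no ammonium acetate or isopropanol.

```python
from typing import Dict


def final_concentration(stock_conc: float, stock_vol: float, total_vol: float) -> float:
    """Concentration after diluting stock_vol of stock into total_vol."""
    if total_vol <= 0:
        raise ValueError("total_vol must be positive")
    return stock_conc * stock_vol / total_vol


def precipitation_mix(
    sample_ul: float = 200.0,
    salt_stock_molar: float = 10.0,
    salt_ul: float = 20.0,
    isopropanol_ul: float = 220.0,
) -> Dict[str, float]:
    """Total volume, final ammonium acetate molarity and isopropanol fraction."""
    total_ul = sample_ul + salt_ul + isopropanol_ul
    return {
        "total_ul": total_ul,
        "ammonium_acetate_molar": final_concentration(salt_stock_molar, salt_ul, total_ul),
        "isopropanol_fraction": isopropanol_ul / total_ul,
    }
```

With the stated volumes, the mixture totals 440 microliters, is about 0.45 M in ammonium acetate and is half isopropanol by volume.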

For use as a control, dsRNA corresponding to the full length coding region of the UCP-2 (uncoupling protein 2) gene was prepared in a similar manner.

In vitro transcription can also be done in 96-well format using both T3 and T7 promoters to generate sense and antisense strands. Purification of the transcripts is done using RNA purification columns, such as, but not limited to, the RNeasy kit (available from Qiagen). Annealing of both strands in the absence of potassium chloride or sodium chloride can be achieved using ammonium acetate, e.g., at about 10 μM to 1 mM concentration. The reaction buffer is then adjusted to 500 mM of sodium chloride before RNase T1 treatment. RNase T1 is added to degrade any non-annealed single-stranded RNA. The resulting products are passed through RNA purification columns again to remove RNase T1. Concentration of the final dsRNA products can be measured using a plate reader.

Synthesis of Double-stranded RNA. Equimolar quantities of sense and antisense RNA strands from either eGFP or UCP-2 were added in a reaction solution of annealing buffer; annealing of the sense and antisense strands was carried out by incubation at 60° C. for thirty minutes, after which the mixture was allowed to cool at room temperature. A variety of annealing buffers can be used. For example, when an annealing solution containing 20 mM sodium chloride is used, the reaction mixture is incubated at 60° C. for thirty minutes and cooled for about 15 minutes to afford a dsRNA. Alternatively, the RNA can be added to 10 mM Tris (pH 7.5) buffer containing 20 mM of sodium chloride. The mixture is incubated at 95° C. for about one minute and cooled at room temperature for about 12 to 16 hours to afford a dsRNA. In another embodiment, the RNA is precipitated in 1 M ammonium acetate solution and resuspended in double distilled water. The mixture is then incubated at 60° C. for thirty minutes and cooled for about 15 minutes to afford a dsRNA. The latter buffer solution differs from annealing buffers used by others, which contain potassium or sodium chloride. The approach described here also differs from other approaches in that incubation typically is only for 30 minutes, whereas the others typically incubate the mixture overnight (see, e.g., Tuschl et al., Genes and Dev't, 1999, 13, 3191-3197).

Transfection of double-stranded RNA into cells. AGYNB-010 cells were plated in six well plates at a density of 3-4×10⁵ cells/ml in DMEM containing 10% fetal bovine serum (Sigma). Twenty-four hours later, the AGYNB-010 cells were washed in serum free DMEM in preparation for transfection. Two separate solutions were prepared: Solution A contained 1-5 micrograms of double-stranded GFP RNA or control RNA (UCP-2 RNA) and 100 microliters of serum free DMEM. Solution B was prepared by diluting Lipofectamine (Gibco BRL) with serum free DMEM (9:1 ratio). Solutions A and B were gently mixed and incubated for 15 minutes at room temperature, then 0.8 ml of serum free DMEM was added to the transfection mixture and this mixture was overlaid on the washed cells. Care was taken to ensure that the final volume of the transfection mixture overlaid on the cells did not exceed 1 ml. The cells were incubated at 37° C. in a CO₂ incubator for 18-24 hours. The cells were then drained of the transfection mixture, which was replaced with fresh DMEM containing 10% FBS.

Measurement of the Level of Gene Specific Silencing

Direct fluorometry: Two days after transfection, 10⁶ AGYNB-010 cells transfected with either eGFP dsRNA or UCP-2 dsRNA were seeded on a plate and the amount of green fluorescence quantitated by using a cytofluor plate reader (e.g., Series 4000, PerSeptive Biosystems).

Western Blot analysis: Two to five days after transfection, cell extracts from AGYNB-010 cells transfected with either eGFP dsRNA or UCP-2 dsRNA were harvested in standard RIPA buffer and the total protein concentration determined using the BCA assay system from Pierce. Thirty micrograms of total protein from the cell extracts was loaded per lane on an SDS-PAGE gel. The gel was transferred to nitrocellulose using standard western transfer procedures. GFP protein was detected using anti-GFP from Chemicon. The level of microtubule associated protein-2 (i.e., MAP2), a non specific protein, was used as a loading control. Anti-MAP2 was obtained from Sigma. The western blot was then scanned and quantified using NIH Image.
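One common way to use a loading control in the quantification step is to divide each GFP band intensity by the MAP2 intensity from the same lane before comparing lanes. The sketch below illustrates that normalization; it is an assumption about how the scanned intensities might be handled, not the procedure recorded in this example, and the intensity values are hypothetical.

```python
def normalized_level(target_intensity, loading_intensity):
    """Target band intensity divided by the loading-control band of the same lane."""
    if loading_intensity <= 0:
        raise ValueError("loading-control intensity must be positive")
    return target_intensity / loading_intensity


def percent_reduction(treated_level, control_level):
    """Reduction of the treated lane relative to the control lane, in percent."""
    return (1.0 - treated_level / control_level) * 100.0


# Hypothetical densitometry readings (arbitrary units), for illustration only.
mock_level = normalized_level(target_intensity=1200.0, loading_intensity=1000.0)
egfp_dsrna_level = normalized_level(target_intensity=300.0, loading_intensity=1000.0)

reduction = percent_reduction(egfp_dsrna_level, mock_level)
print(f"eGFP reduction relative to mock: {reduction:.1f}%")
```

Normalizing lane by lane means that small differences in the amount of protein loaded do not show up as apparent changes in GFP level.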

Methods for Detecting Various Cell Sensitivities

Oxygen-glucose deprivation (OGD). To measure the sensitivity of cells to combined oxygen-glucose deprivation, cells were resuspended in glucose free deoxygenated media (Earle's balanced salt solution (EBSS) containing 116 mM NaCl, 5.4 mM KCl, 0.8 mM MgSO₄, 1 mM NaH₂PO₄, and 0.9 mM CaCl₂) bubbled with 5% H₂/85% N₂/5% CO₂. The cells were transferred for 5 or 60 min at 37° C. to an anaerobic chamber containing the following gas mixture: 5% H₂, 85% N₂, and 5% CO₂. At the end of the incubation period, oxygen glucose deprivation was terminated simply by removing the cells from the anaerobic chamber and replacing the EBSS solution with oxygenated growth media. Sensitivity of the cells to OGD was determined by measuring cell death. The cells were stained with calcein and ethidium homodimer (Molecular Probes), which stain live cells and dead cells, respectively, the staining was quantitated on a cytofluor plate reader, and the percentage of dead cells was determined. One can also use any of the other conventional methods known to one skilled in the art to determine cell health.

NMDA Sensitivity. Cells were washed with control salt solution (CSS) containing 120 mM NaCl, 5.4 mM KCl, 1.8 mM CaCl₂, 25 mM Tris-HCl, 15 mM glucose, pH 7.4. N-Methyl-D-aspartic acid (NMDA) was applied in CSS for 5 min, and after this incubation time the NMDA solution was removed from the cells and replaced with growth medium. Toxicity was assayed 20-24 hours after exposure to NMDA solution.

β-Amyloid Sensitivity. Cells were plated the day before exposure to either β-amyloid or peroxide in a 24 well plate at a concentration of 1×10⁵ cells/well. To measure sensitivity to β-amyloid, cells were exposed to 1-50 μM β-amyloid for 24-72 hours using the CSS solution described above for the NMDA sensitivity test. β-Amyloid was made by first solubilizing it in DMSO or an aqueous solution and then diluting the resulting solution in DMEM. In both instances, sensitivity was assessed by measuring cell death using the staining procedure described in the section on assays for OGD.

Results

FIG. 5 shows the results of a Western Blot analysis. Lanes 1 and 2 of the gel show eGFP and MAP2 protein levels for untransfected cells (i.e., “mock” cells). However, lanes 6-8 show a significant reduction in eGFP levels for AGYNB-010 cells transfected with 3 μg of eGFP-C dsRNA; likewise, cells transfected with 3 μg of enhanced green fluorescent protein (i.e., eGFP) dsRNA also showed a significant reduction in eGFP levels (lanes 9-10). The results demonstrate selectivity in inhibition in that eGFP expression is inhibited by eGFP dsRNA but not UCP-2 dsRNA. The consistent bands for MAP2 across all lanes confirm consistency in protein loading.

The AGYNB-010 neuroblastoma derived cell line was shown to be sensitive to β-amyloid, NMDA and OGD as compared to Neuro 2A cells from which the AGYNB-010 cells are derived (see FIGS. 7A, 7B and 7C, respectively). As indicated supra, these sensitivities mean this particular cell line can serve as a useful model for conducting studies of various biological phenomena associated with such sensitivities. For instance, the cell line can be used in studying stroke (e.g., ischemic stroke), as stroke is associated with oxygen deprivation.

EXAMPLE 5 Double-Stranded PARP RNA Blocks Endogenous PARP Expression

Ischemic stroke results from transient or permanent reduction of the cerebral blood flow. Neuronal cells require high oxygen levels for viability and normal function. Deprivation of oxygen thus leads to neuronal death causing brain damage. In contrast, shorter exposures to ischemia result in protection from neuronal damage, a phenomenon known as ischemic tolerance, or ischemic preconditioning. PARP (poly-ADP-ribose-polymerase) is a gene that is up-regulated in ischemia. Thus, PARP inhibitors or inhibition of PARP may have neuroprotective effects. As demonstrated in Example 4, AGYNB-010 cells are sensitive to oxygen glucose deprivation and thus provide a sensitized system for studying ischemia.

Transfection of dsRNA into cells. Single strands of sense and anti-sense RNA from the C terminus or N terminus of the PARP gene (NM_007415, e.g., PARP-N 79-1171 and PARP-C 2200-2797 regions) or from UCP-2 as control were transcribed, purified and concentrated according to the general procedure set forth in Example 4. The single strands were converted to dsRNA and then transfected into AGYNB-010 cells also as described in Example 4.

Measurement of the level of gene specific silencing. Cells transfected with UCP-2 dsRNA (dsUCP-2), PARP dsRNA from the C terminus (dsPARP-C), or PARP dsRNA from the N-terminus (dsPARP-N) were harvested and analyzed by Western blot according to the protocol described in Example 4. PARP protein was detected using anti-PARP from Oncogene.

Assays for resistance to cell death. AGYNB-010 cells that were untransfected, transfected with eGFP dsRNA (see Example 4), or transfected with PARP dsRNA were assayed for their sensitivity to oxygen glucose deprivation. Sensitivity was measured using the cell death assay described in Example 4.

Results and Discussion

As observed in FIG. 6A, AGYNB-010 cells transfected with dsPARP RNA from either the C terminus (lanes 3-6) or N terminus (lanes 7-10) show significant reduction in endogenous PARP levels. Endogenous PARP levels are not affected by transfection with dsUCP-2 (lanes 1-2), thus demonstrating the ability to selectively inhibit a target gene by introducing a dsRNA corresponding to the target gene.

The RNAi mediated inhibition of PARP also induces resistance to OGD, as observed by determining cell death. FIG. 6B is a view showing the number of stained cells (i.e., healthy cells) present for cells transfected with dsEGFP 3 hours after the start of oxygen glucose deprivation. FIG. 6C shows a similar view of cells similarly treated, except the cells are transfected with dsPARP. FIG. 6D is a chart showing the same results as in FIGS. 6B and 6C. The chart also shows results for two controls: (1) the extent of cell death for cells not exposed to OGD, and (2) mock cells (i.e., untransfected cells) subject to 3 hours of OGD. Collectively, these results show the ability of dsPARP to rescue cells subjected to 3 hours of OGD.

Thus, these functional validation results obtained by RNAi are consistent with the gene expression data indicating that up-regulation of PARP is correlated with harmful cellular effects caused by ischemia. The results with the model system provided herein indicate that inhibition of PARP can provide a neuroprotective effect, particularly against ischemia. This makes PARP an attractive target for treatment of stroke.

EXAMPLE 6 dsRNA to Inhibit Gene Expression

To determine whether long dsRNA mediates gene-specific silencing in mouse neuroblastoma N2a cells, cell lines were generated expressing EGFP stably (N2a-EGFP) to mimic the expression of an endogenous gene. dsRNA corresponding to the full-length (dsEGFP) or to the C-terminal part of the EGFP (dsEGFP-C) open reading frame was generated for transfection into these N2a-EGFP cells. Control dsRNA was made from the entire coding region of the uncoupling protein-2 (UCP2). N2a-EGFP cells were plated 24 hours before transfection. In most cases, 5 nM of the double-stranded RNA was transfected using Lipofectamine (Invitrogen) per the manufacturer's instructions. The cells were incubated in the Lipofectamine complex overnight, after which normal media was added and cells were incubated for 4 additional days. The fluorescence decreased significantly in cells transfected with dsEGFP compared with those transfected with dsUCP-2 and mock transfected controls. Western blot shows that both dsEGFP and dsEGFP-C decrease the EGFP protein level significantly without affecting the expression of the housekeeping gene MAP-2 (FIG. 8A). dsEGFP inhibited EGFP expression in a dose-dependent manner, with 5 nM showing the maximum effect (FIG. 8B).

In this series of experiments, the reporter gene EGFP was used to facilitate detection of expression. However, unlike other studies in which the expression plasmid coding for the reporter gene was cotransfected with dsRNA, dsRNA was transfected into a stable cell line expressing EGFP protein that mimics expression of endogenous genes. The dsRNA-induced inhibitory effects we observed were clearly gene-specific. Control dsRNA corresponding to an unrelated gene showed no significant suppression of EGFP expression compared with mock-transfected cells. Furthermore, cells remain alive and healthy with no significant apoptosis induced by transfection of double-stranded RNA. These experiments also show that both long and short dsRNA have an effect on expression.

Studies have shown that short siRNA of 21 nt can induce efficient gene-specific silencing in mammalian cells. In N2a cells, it is demonstrated herein that siRNA derived from the FASTK ORF indeed induced gene-specific silencing, confirming the efficacy of siRNA in these neuronal cells. However, higher concentrations of siRNA appear to be needed compared with long dsRNA, consistent with the current hypothesis that dsRNA is processed into 21-23 nt siRNA before degrading the target mRNA. Interestingly, we have shown that the level of inhibition of the EGFP expression is dependent on the amount of dsEGFP-RNA. Based on the current understanding of dsRNA processing, long dsRNA may be processed into 21-23 nt siRNA to induce degradation of target mRNA in these neuronal cells, which may explain the higher efficiency of equal concentrations of long dsRNA compared to siRNA.

Materials and Methods

Materials and methods are as described in Example 4, except for the following:

Double-stranded RNA. The plasmid EGFP-C1 was used as template to produce PCR fragments corresponding to the full-length coding region and the C-terminal fragment of EGFP. The PCR fragments were then subcloned in the PCRTOPO4.0 plasmid. For in vitro transcription, the plasmid was linearized and transcribed with T3 and T7 RNA polymerase. Oligonucleotides used for PCR amplification of the full-length EGFP ORF are (SEQ ID NO:17) ATGGTGAGCAAGGGCGAGGAGCTG and (SEQ ID NO:18) TCTGAGTCCGGACTTGTACAGCTC. dsRNA-EGFP and EGFP-C were 727 and 620 bp long, respectively. For annealing, equimolar amounts of sense and antisense transcripts were heated at 60° C. for 3 minutes and cooled down at room temperature for more than 15 minutes. In addition, the dsRNA preparation was treated with RNase T1 to eliminate any remaining ssRNA. The quality of the dsRNA preparations was analyzed on a 1.2% native agarose gel. Sense and anti-sense UCP2 RNA were generated using T7 and SP6 polymerase, respectively. The template used was a PCR fragment. UCP2 ds-RNA was generated similarly as described.
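Because the dsRNA dose in this example is given as a molar concentration (5 nM), it can be useful to convert it to a mass for a dsRNA of known length. The sketch below does this under two stated assumptions: an average mass of about 680 g/mol per base pair of dsRNA (a round-number approximation, roughly twice 340 g/mol per nucleotide), and a transfection volume of 1 ml, taken from the volume limit mentioned in Example 4. Both values are parameters and can be changed.

```python
# Assumption: approximate average molar mass of one dsRNA base pair (g/mol).
AVG_MW_PER_BP_DSRNA = 680.0


def dsrna_mass_ug(conc_nm, volume_ml, length_bp, mw_per_bp=AVG_MW_PER_BP_DSRNA):
    """Mass of dsRNA (micrograms) needed for a molar concentration in a given volume."""
    moles = conc_nm * 1e-9 * volume_ml * 1e-3   # nM -> mol/L, ml -> L
    grams = moles * length_bp * mw_per_bp
    return grams * 1e6                          # g -> micrograms


# Lengths stated in the text for dsRNA-EGFP and EGFP-C.
for name, length_bp in (("dsRNA-EGFP", 727), ("EGFP-C", 620)):
    mass = dsrna_mass_ug(conc_nm=5.0, volume_ml=1.0, length_bp=length_bp)
    print(f"{name}: about {mass:.2f} micrograms for 5 nM in 1 ml")
```

With these assumptions, 5 nM corresponds to roughly 2 to 2.5 micrograms of dsRNA per ml, which falls within the 1-5 microgram range used in Example 4.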

EXAMPLE 7 dsRNA Inhibition of Endogenous Sequences

To test for RNAi-mediated gene silencing of endogenous genes, several types of genes were chosen, including genes involved in apoptotic pathways such as caspase-3, p53, and 14-3-3, kinases such as MAP kinase p38 and fas-activated serine threonine kinase (FASTK), and housekeeping enzymes such as HMG-Coenzyme A synthase 1. For quantification of changes in gene expression after dsRNA transfection, RT-PCR and real time PCR were used.

To generate dsRNAs corresponding to the above mouse genes, partial sequences of the mouse genes were cloned into PCR4.0TOPO (Invitrogen). The average length of the dsRNAs corresponding to the partial sequences of the above genes is about 600-800 bps. The table below shows the positions of the oligonucleotides used for cloning the PCR fragments, the length of each dsRNA, and the positions of the oligonucleotides used for quantification of the mRNA level after transfection.

TABLE 1
dsRNA-mediated inhibition of gene expression quantified with real-time PCR.

Gene name                 Accession Number   dsRNA Region   % of knockdown¹
mCaspase 3                NM_009810          2-723          87.3 ± 4.34
mFAST                     NM_023229          67-640         65.8 ± 7.34
mp53                      M13873             102-887        65.7 ± 7.68
mcoenzyme A synthase 1    BC013443           174-583        73.7 ± 9.17
m14-3-3                   D87663             284-774        98.7 ± 0.41
mp38                      NM_011957          321-828        89.7 ± 4.34
rp38²                                        1606-2204      85.0 ± 1.91

Note:
¹Percentage of knockdown is calculated as (%)
²Efficient inhibition of mouse p38 expression was observed using the dsRNA derived from the 3′ UTR of rat p38, which shares about 80% homology with the mouse sequence.

5 nM dsRNA was transfected into N2a cells. Three days after transfection, cells were harvested and total RNA was extracted for RT-PCR as described in Materials and Methods. As shown in FIG. 9A-F, gene specific dsRNA induces profound silencing of the cognate mRNA while control dsRNA-EGFP shows little or no silencing effect. To test whether dsRNA induces non-specific silencing, we used GAPDH as our internal control. No difference in the expression levels of GAPDH was observed between dsRNA-transfected and mock-transfected cells, indicating that there was no non-specific silencing induced by long dsRNA in these mouse neuroblastoma cells under current experimental conditions (FIG. 9A-F).

Real time PCR was performed using an iCycler Real-Time detection system (Bio-Rad Laboratories, Hercules, Calif.) to quantitatively measure dsRNA-induced gene-specific silencing (above table). The efficiency of inhibition ranges from 65% to 98%, depending on the gene. These results further confirm that RNAi machinery is highly active in N2a cells and the silencing effects we observed at the protein level are not due to post-translational mechanisms, but are mediated at the transcriptional level.

It has previously been reported that dsRNA-induced inhibition appears to be transient when transfected into mammalian cells. The present experiments show dsRNA-induced inhibition that is both transient and time-dependent. Different genes require different lengths of post-transfection time for efficient inhibition of gene expression. For example, dsRNA-p53 induced maximal inhibition in 24-48 hours, while both dsRNA-PARP and dsRNA-EGFP induce maximal inhibition in 96 hours. This phenomenon can be best explained by the fact that the mRNA and protein of different genes have different stability and turnover time. For example, EGFP is known to be much more stable than p53 protein, which has a typical half-life of 20 minutes.
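The link between protein stability and the time needed to see silencing can be illustrated with a simple first-order decay model. This is only a sketch: it assumes that new synthesis stops completely at time zero and that the existing protein decays with a fixed half-life. The 20 minute half-life for p53 is taken from the text; the 24 hour half-life used for EGFP is a hypothetical value chosen for illustration, since the text says only that EGFP is much more stable.

```python
def fraction_remaining(hours, half_life_hours):
    """Fraction of pre-existing protein left after first-order decay."""
    return 0.5 ** (hours / half_life_hours)


P53_HALF_LIFE_H = 20.0 / 60.0   # 20 minutes, as stated in the text
EGFP_HALF_LIFE_H = 24.0         # hypothetical value, for illustration only

for hours in (24, 48, 96):
    p53 = fraction_remaining(hours, P53_HALF_LIFE_H)
    egfp = fraction_remaining(hours, EGFP_HALF_LIFE_H)
    print(f"{hours:>3} h: p53 remaining {p53:.2e}, EGFP remaining {egfp:.3f}")
```

In this model a short-lived protein disappears soon after its mRNA is silenced, while a stable protein remains detectable for days, which is in line with the longer time to maximal inhibition reported for EGFP.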

The effective dsRNAs used so far all corresponded to full-length or portions of the open reading frame and shared 100% sequence identity with endogenous sequences. The question was then addressed whether effective dsRNA can only be derived from the ORF and whether 100% sequence identity is required for efficient silencing of the cognate message.

The present experiments show that dsRNA corresponding to the partial ORF of mouse p38 induces efficient silencing of p38 mRNA in mouse neuroblastoma N2a cells (FIG. 9C). It was then tested whether dsRNA corresponding to the 3′UTR of the rat p38 gene (dsRNA-rat-p38), which shares about 80% identity with the mouse p38 gene sequence in that region, also can induce gene-specific silencing in N2a cells. Indeed, dsRNA-rat-p38 induces efficient silencing of p38 mRNA in N2a cells, indicating that effective dsRNA is not restricted to sequences in the ORF, and that 100% sequence identity is not required.
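Percent identity between two sequences, such as the roughly 80% figure quoted for the rat and mouse p38 3′UTR regions, is usually computed over a pairwise alignment. The sketch below shows one simple convention (matches divided by aligned columns, skipping columns where both sequences have a gap). The two short sequences are hypothetical and are not the p38 sequences; producing the alignment itself is outside the scope of the sketch.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue                  # column carries no information
        compared += 1
        if x == y:
            matches += 1              # x cannot be a gap here
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments differing at two of ten positions.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAA")
print(f"identity: {identity:.1f}%")
```

Other conventions exist (for example, excluding all gapped columns from the denominator), so a reported identity value should always be read together with the convention used.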

Recent studies show that long dsRNA induced gene-specific silencing in non-differentiated and embryonic cells. In our experiments, N2a cells undergo serum withdrawal, which induces partial differentiation. It was then tested whether RNAi is active in fully differentiated N2a cells. Cells were transfected with 5 nM dsRNA and then incubated in Neurobasal medium (Invitrogen) with N2 supplement (Invitrogen) in the presence of 20 μM retinoic acid to induce neuronal differentiation. After three days, N2a cells were fully differentiated with long processes. The proliferation rate decreased dramatically after 2-3 days in differentiation media, as measured by the amount of incorporated BrdU. Real-time PCR was then used to test whether long dsRNA can induce gene-specific silencing in these fully differentiated neuronal cells. The 14-3-3 mRNA level was inhibited by about 80% in differentiated cells transfected with ds14-3-3 RNA compared with cells transfected with dsEGFP (FIG. 10). These results indicate that RNAi is not restricted to non-differentiated cells or cells of embryonic origin, but is active in these fully-differentiated neuronal cells.

Materials and Methods

Materials and methods are as described in Example 4, except for the following:

Quantitation of Expression. SYBR Green real-time PCR amplifications were performed in an iCycler Real-Time Detection System (Bio-Rad Laboratories, Hercules, Calif.). Primers were designed using Primer3, developed by the Whitehead Institute for Biomedical Research, and the primer (Operon Technologies, Alameda, Calif.) concentrations were optimized for use with the SYBR green PCR master mix reagents kit. The sizes of the amplicons were checked by running out the PCR product on a 1.5% agarose gel. The thermal profile for all SYBR Green PCRs was 50° C. for 2 minutes and 95° C. for 10 minutes, followed by 45 cycles of 95° C. for 15 seconds, 60° C. for 30 seconds and 72° C. for 40 seconds. Standard curves were used to calculate the PCR efficiency of each primer set. As an endogenous reference we used glyceraldehyde-3-phosphate dehydrogenase (GAPDH), recently demonstrated to be a suitable control gene for studying brain injury with real-time RT-PCR. All PCR reactions were performed in triplicate. Quantification was performed using the comparative cycle threshold (CT) method, where CT is defined as the cycle number at which fluorescence reaches a set threshold value. The target transcript was normalized to an endogenous reference (simultaneous triplicate GAPDH reactions), and relative differences were calculated using the PCR efficiencies.
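The quantification described above can be sketched in a few lines. The text states only that standard curves gave the PCR efficiencies and that relative differences were calculated from them after normalization to GAPDH; the efficiency-corrected ratio used below is one widely used way of doing this and is an assumption here, as are the function names and the CT values, which are hypothetical.

```python
def efficiency_from_slope(slope):
    """Amplification factor per cycle from a standard-curve slope (CT vs log10 input)."""
    return 10.0 ** (-1.0 / slope)


def mean(values):
    """Arithmetic mean of replicate CT values."""
    return sum(values) / len(values)


def relative_expression(e_target, ct_target_control, ct_target_treated,
                        e_ref, ct_ref_control, ct_ref_treated):
    """Target expression in treated vs control cells, normalized to a reference gene."""
    target_ratio = e_target ** (ct_target_control - ct_target_treated)
    ref_ratio = e_ref ** (ct_ref_control - ct_ref_treated)
    return target_ratio / ref_ratio


# Hypothetical triplicate CT values, for illustration only.
target_control = mean([22.0, 22.1, 21.9])
target_treated = mean([25.0, 25.1, 24.9])
gapdh_control = mean([18.0, 18.0, 18.0])
gapdh_treated = mean([18.0, 18.1, 17.9])

e = efficiency_from_slope(-3.32)   # a slope near -3.32 means about 2-fold per cycle
ratio = relative_expression(e, target_control, target_treated,
                            e, gapdh_control, gapdh_treated)
print(f"relative expression: {ratio:.3f}")
print(f"knockdown: {(1.0 - ratio) * 100.0:.1f}%")
```

A ratio below 1 indicates that the target transcript is reduced in the treated cells; for example, a shift of three cycles at about 2-fold amplification per cycle corresponds to roughly an 8-fold reduction.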

DsRNA corresponding to FASTK, caspase-3, p53, 14-3-3, p38, and 3-hydroxy-3-methylglutaryl-Coenzyme A synthase (synthase) were generated as follows. Briefly, partial sequences of these mouse genes were cloned using RT-PCR and inserted into PCR4.0TOPO to serve as templates for in vitro transcription. The oligonucleotides used for PCR to generate the partial clone of FASTK (accession #NM_023229) are (SEQ ID NO:19) GTCTCCACCACCCAGCTCCATG and (SEQ ID NO:20) AGATGCTGACGAGGGTACTGCA. The oligonucleotides used for PCR to generate the partial clone of caspase-3 (accession #NM_009810) are (SEQ ID NO:21) TGGAGAACAACAAAACCTCAGTGG and (SEQ ID NO:22) CTGTTAACGCGAGTGAGAATGTGC. The oligonucleotides used for PCR to generate the partial clone of p53 (accession #M13873) are (SEQ ID NO:23) ACCTCACTGCATGGACGATCTG and (SEQ ID NO:24) GCAGTTCAGGGCAAAGGACTTC. The oligonucleotides used for PCR to generate the partial clone of 14-3-3 (accession #D87663) are (SEQ ID NO:25) CGGCAAATGGTTGAAACTGA and (SEQ ID NO:26) CCTGCAGCGCTTCTTTATTCT. The oligonucleotides used for PCR to generate the partial clone of p38 (accession #NM_011951) are (SEQ ID NO:27) GCAGGAGAGGCCCACGTTCT and (SEQ ID NO:28) CATCATCAGTGTGCCGAGCCA. The oligonucleotides used for PCR to generate the partial clone of 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 (accession #BC013443) are (SEQ ID NO:28) CGTGGTATCTGGTCAGAGTGGA and (SEQ ID NO:29) GCCAGACCACMCAGGAAGCAT. All dsRNA used for transfections are blunt-ended.

The primers used to quantify caspase-3 are: (SEQ ID NO:30) GTACGCGCACAAGCTAGAAT and (SEQ ID NO:31) AAAGTGGAGTCCAGGGAGAAG; for FASTK, the primers are (SEQ ID NO:32) GGTGGTCAAAGGTTGGAAGT and (SEQ ID NO:33) CCATTACGTGAGGAGTCAGTTC; for p53, the primers are (SEQ ID NO:34) GCGTAAACGCTTCGAGATG and (SEQ ID NO:35) AGTAGACTGGCCCTTCTTGGT; for synthase, the primers are (SEQ ID NO:36) CTGGCCAGTGGTAAATGTACTG and (SEQ ID NO:37) CTCTGCCTTTTGCTGTCAGA; for 14-3-3, the primers are (SEQ ID NO:38) CGCTGTGGACCTCAGACAT and (SEQ ID NO:39) GGGGTAGTCAGAGATGGTTTCT; for p38, the primers are (SEQ ID NO:40) GTGGAAGAGCCTGACCTATGAT and (SEQ ID NO:41) CCCCTCACAGTGAAGTGAGATA.
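Basic properties of the quantification primers listed above, such as length and GC content, can be checked with a short script. The sketch below uses the two p38 primers as given in the text; the helper names are illustrative, and the script makes no claim about how the primers were originally evaluated.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# p38 quantification primers as listed in the text (SEQ ID NO:40 and NO:41).
p38_primers = {
    "SEQ ID NO:40": "GTGGAAGAGCCTGACCTATGAT",
    "SEQ ID NO:41": "CCCCTCACAGTGAAGTGAGATA",
}

for name, seq in p38_primers.items():
    print(f"{name}: length {len(seq)} nt, GC {gc_percent(seq):.1f}%, "
          f"reverse complement {reverse_complement(seq)}")
```

The reverse complement is useful when locating a reverse primer on the reference transcript, since the primer itself matches the opposite strand.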

For the purposes of clarity and understanding, the invention has been described in these examples and the above disclosure in some detail. It will be apparent, however, that certain changes and modifications may be practiced within the scope of the appended claims. All publications and patent applications listed herein are hereby incorporated by reference in their entirety for all purposes to the same extent as if each were so individually denoted.

1. A method for producing and identifying an active double stranded RNA (dsRNA) which attenuates a desired gene expression in a cell, said method comprising: (a) producing a plurality of cDNA, wherein each cDNA comprises at least a portion of a gene that is expressed in a cell; (b) producing a candidate dsRNA from at least one of the cDNA; (c) introducing the candidate dsRNA into a reference cell; and (d) identifying an active dsRNA by determining whether the candidate dsRNA modulates a desired candidate gene expression in the reference cell.
2. The method of claim 1 further comprising producing the identified active dsRNA from a corresponding cDNA of step (a).
3. The method of claim 1, wherein said step of identifying the active dsRNA comprises: (a) selecting a candidate gene, wherein the candidate gene is a gene that is expressed in a test cell and/or a control cell, and/or is expressed at a detectably different level with respect to the test cell and the control cell, and the test cell and control cell differ with respect to a cellular characteristic; and (b) identifying whether the candidate dsRNA is an active dsRNA by determining whether down-regulation of expression of the candidate gene in a reference cell has a functional effect in the reference cell, wherein the determining step comprises: (i) introducing the candidate dsRNA which is substantially identical to at least a part of the candidate gene into the reference cell; and (ii) detecting an alteration in a cellular activity or a cellular state in the reference cell, alteration indicating that the candidate gene plays a functional role in the reference cell and is an active dsRNA.
4. The method of claim 1, wherein said step of producing a plurality of cDNA comprises: (i) isolating at least one mRNA from the cell, and (ii) producing a double-stranded cDNA from the isolated mRNA by reverse transcription.
5. The method of claim 4, wherein said step of producing a plurality of cDNA further comprises producing cDNAs of a similar length by digesting cDNA of said step (ii) with a restriction enzyme.
6. The method of claim 5, wherein said step (b) of producing the candidate dsRNA comprises: (i) producing a plasmid or PCR fragment from the cDNA, and (ii) producing the candidate dsRNA from the plasmid or PCR fragment.
7. The method of claim 6, wherein the plurality of cDNA comprises at least a portion of substantially all genes that are actively expressed in the cell.
8. The method of claim 6, wherein the desired effect of the candidate dsRNA on the reference cell is a result of the candidate dsRNA attenuating expression of a candidate gene in the reference cell.
9. The method of claim 8, wherein the candidate dsRNA has complete sequence identity with the candidate gene over at least 100 nucleotides.
10. The method of claim 9, wherein the candidate dsRNA has partial sequence identity with the candidate gene over at least 100 nucleotides.
11. The method of claim 10, wherein said partial sequence identity correlates with the untranslated region of said candidate gene.
12. The method of claim 11, wherein the candidate dsRNA is at least 500 nucleotides in length.
13. The method of claim 12, wherein the candidate dsRNA is the length of the candidate cDNA.
14. A method for identifying and validating the effect of an active double-stranded RNA (dsRNA) which attenuates a desired gene expression in a cell, said method comprising: (a) producing a candidate dsRNA which comprises at least a portion of a candidate gene that is expressed in a control cell; (b) introducing the candidate dsRNA into a reference cell; and (c) identifying whether the candidate dsRNA is an active dsRNA by detecting an alteration in a cellular activity or a cellular state in the reference cell, alteration indicating that the candidate gene plays a functional role in the reference cell and is an active dsRNA.
15. The method of claim 14, wherein said step of producing the candidate dsRNA comprises: (i) producing a cDNA from a mRNA of the control cell such that the cDNA comprises at least a portion of the gene that is expressed in the control cell; and (ii) producing the candidate dsRNA from at least one of the cDNA of said step (i).
16. The method of claim 14, wherein the candidate gene is a gene that is expressed in a test cell and/or the control cell, and/or is expressed at a detectably different level with respect to the test cell and the control cell, and the test cell and control cell differ with respect to a cellular characteristic.
17. A method for correlating genes and gene function, said method comprising: (a) producing a plurality of candidate dsRNAs from a plurality of cDNAs of a control cell such that each candidate dsRNA comprises at least a portion of a gene that is expressed in the control cell; (b) introducing each of the candidate dsRNA into a plurality of separate reference cells each having a gene expression similar to the control cell in step (a); and (c) identifying which candidate dsRNA is an active dsRNA by detecting an alteration in a cellular activity or a cellular state in the reference cell, desired alteration indicating that the gene corresponding to the candidate dsRNA plays a functional role in the reference cell.
18. The method of claim 17, wherein the plurality of cDNAs is produced from a plurality of mRNAs which are produced by the control cell.
19. The method of claim 18, wherein said step of producing a plurality of cDNA comprises: (i) isolating at least one mRNA from the cell; (ii) producing a double-stranded cDNA from the isolated mRNA by reverse transcription; (iii) producing cDNAs of a similar length by digesting cDNA of said step (ii) with a restriction enzyme; and (iv) producing a plasmid or PCR fragment from the cDNA of said step (iii).
20. The method of claim 19, wherein the candidate dsRNA is produced by transcribing the plasmid cDNA or PCR fragment of said step (iv).
21. The method of claim 19, wherein the plurality of cDNA comprises at least a portion of substantially all genes that are actively expressed in the cell.
22. The method of claim 19, wherein the restriction enzyme is selected from the group consisting of DpnI and RsaI.
 23. The method of claim 17, wherein said step of producing the plurality of candidate dsRNAs comprises: (A) selecting a candidate gene, wherein the candidate gene is a gene that is expressed in a test cell and/or a control cell, and/or is expressed at a detectably different level with respect to the test cell and the control cell, and the test cell and control cell differ with respect to a cellular characteristic; and (B) producing the plurality of candidate dsRNAs, wherein each candidate dsRNA is substantially identical to at least a part of the candidate gene.
 24. The method of claim 23, wherein the candidate gene is selected from a normalized library prepared from cells of the same type as the test cell or the control cell and is present in low abundance in the normalized library.
 25. The method of claim 23, wherein the candidate gene is a differentially expressed gene selected from a subtracted library that is enriched for genes that are differentially expressed with respect to the test cell and the control cell.
 26. The method of claim 25, wherein the subtracted library is also normalized and the candidate gene is one of the genes that is both present in low abundance and differentially expressed in the subtracted and normalized library.
 27. The method of claim 23, wherein said step of selecting the candidate gene comprises: (i) preparing (A) a tester-normalized cDNA library which is a normalized library prepared from test cells; (B) a driver-normalized cDNA library which is a normalized library prepared from control cells; (C) a tester-subtracted cDNA library which is enriched in one or more genes that are up-regulated with respect to the test cell and the control cell, and (D) a driver-subtracted cDNA library which is enriched in one or more genes that are down-regulated with respect to the test cell and the control cell; and (ii) identifying one or more clones from the normalized libraries and/or the subtracted libraries, wherein the candidate gene is one of the clones identified.
 28. The method of claim 27, wherein said step of identifying one or more clones from the normalized libraries comprises: (A) contacting clones from the tester-normalized cDNA library with labeled probes derived from mRNA from test cells and contacting clones from the driver-normalized cDNA library with labeled probes derived from mRNA from control cells under conditions whereby probes specifically hybridize with complementary clones to form a first set of hybridization complexes; and (B) detecting at least one hybridization complex from the first set of hybridization complexes to identify a clone from one of the normalized libraries which is present in low abundance.
 29. The method of claim 27, wherein said step of identifying one or more clones from the subtracted libraries comprises: (A) contacting clones from the tester-subtracted cDNA library and contacting clones from the driver-subtracted cDNA library with a population of labeled probes under conditions whereby probes from the population of probes specifically hybridize with complementary clones to form a second set of hybridization complexes, and wherein the population of labeled probes is derived from mRNA from test cells and control cells; and (B) detecting at least one hybridization complex from the second set of hybridization complexes to identify a clone from one of the subtracted libraries which is differentially expressed above a threshold level with respect to the subtracted libraries.
 30. The method of claim 23, wherein the cellular characteristic is cell health, the test cell is a diseased cell and the control cell is a healthy cell, and the candidate gene is potentially correlated with a disease.
 31. The method of claim 30, wherein the test cell is obtained from a mammal that has had a stroke or is at risk for stroke.
 32. The method of claim 30, wherein the test cell is obtained from a mammal that has a neurological disease or develops phenotypes mimicking human neurological diseases.
 33. The method of claim 23, wherein the cellular characteristic is stage of development and the test cell and the control cell are at different stages of development, and the candidate gene is potentially correlated with mediating the change between the different stages of development.
 34. The method of claim 23, wherein the cellular characteristic is cellular differentiation and the candidate gene is potentially correlated with controlling cellular differentiation.
 35. The method of claim 23, wherein the candidate gene is an endogenous gene of the reference cell.
 36. The method of claim 23, wherein the candidate gene is present in the reference cell as an extrachromosomal gene.
 37. The method of claim 17, wherein the reference cell is part of a cell culture.
 38. The method of claim 17, wherein the reference cell is part of a tissue.
 39. The method of claim 17, wherein the reference cell is part of an organism.
 40. The method of claim 17, wherein the reference cell is part of an embryo.
 41. The method of claim 17, wherein the reference cell is a mammalian cell.
 42. The method of claim 17, wherein the reference cell is a neural or glial cell.
 43. The method of claim 42, wherein the reference cell is a neuroblastoma cell.
 44. The method of claim 43, wherein the reference cell is useful as a model system for investigating neurological disease in humans.
 45. The method of claim 44, wherein the reference cell has increased sensitivity to N-methyl-D-aspartate, β-amyloid, peroxide, oxygen-glucose deprivation, or combinations thereof.
 46. The method of claim 45, wherein the detecting step comprises detecting a decrease in cellular sensitivity to N-methyl-D-aspartate, β-amyloid, peroxide, oxygen-glucose deprivation, or combinations thereof.
 47. The method of claim 17, wherein the detecting step comprises detecting modulation of ligand binding to a protein.
 48. The method of claim 17, wherein the reference cell is a part of an organism and the detecting step comprises detecting a change in phenotype.
 49. The method of claim 17, wherein the determining step comprises determining whether interference with expression of the candidate gene in the reference cell is correlated with alteration of a cellular activity or cellular state.
 50. The method of claim 49, wherein interference is achieved by introducing a double-stranded RNA into the reference cell that can specifically hybridize to the candidate gene.
 51. The method of claim 17, wherein the determining step comprises determining whether the protein encoded by the candidate gene binds to another protein to form a complex that can be coimmunoprecipitated.